Abstract

We analyze a multiple-input multiple-output (MIMO) radar model and provide recovery results 
for a compressed sensing (CS) approach. In MIMO radar different pulses are emitted by several 
transmitters and the echoes are recorded at several receiver nodes. Under reasonable assumptions 
the transformation from emitted pulses to the received echoes can approximately be regarded as 
linear. For the considered model, and many radar tasks in general, sparsity of targets within the 
considered angle-range-Doppler domain is a natural assumption. Therefore, it is possible to apply 
methods from CS in order to reconstruct the parameters of the targets. Assuming Gaussian random 
pulses the resulting measurement matrix becomes a highly structured random matrix. Our first main 
result provides an estimate for the well-known restricted isometry property (RIP) ensuring stable and 
robust recovery. We require more measurements than standard results from CS, such as those
for Gaussian random measurements. Nevertheless, we show that, due to the special structure of the
considered measurement matrix, our RIP result is in fact optimal (up to possibly logarithmic factors).

Our two further main results on nonuniform recovery (i.e., for a fixed sparse target scene) reveal
how the fine structure of the support set, and not only its size, affects the (nonuniform) recovery
performance. We show that for certain “balanced” support sets reconstruction with essentially the
optimal number of measurements is possible. Indeed, we introduce a parameter measuring how well
behaved the support set is and recover standard results from CS for near-optimal parameter
choices. We prove recovery results for both perfect recovery of the support set in the case of exactly
sparse vectors and an $\ell_2$-norm approximation result for reconstruction under sparsity defect. Our
analysis complements earlier work by Strohmer & Friedlander and deepens the understanding of the
considered MIMO radar model.

AMS Subject Classification: 94A20, 94A12, 60B20, 90C25, 65F22,

Keywords: MIMO radar, compressed sensing, $\ell_1$-minimization, restricted isometry property, LASSO,

1 Introduction

MIMO (multiple-input multiple-output) radar systems can simultaneously transmit several uncorrelated 
waveforms from spatially distributed transmitters and record the reflected signals at different receiver 
locations. The transmit/receive antennas may either be widely separated, which makes it possible to
view the targets from different angles, or co-located. Co-located antenna configurations, which are studied
in this paper, can yield superior resolution and target identifiability compared to standard phased-array
radar [13, 21, 22].

When it comes to the implementation of a MIMO radar system involving several waveforms, one 
is naturally confronted with an increased amount of data to be processed. Yet in MIMO radar the 
description of the target parameters can turn out to be particularly sparse, i.e., the target scene can be 
modeled as a vector of the object reflectivities with the property that “almost” all entries are zero [8]. 
Since it is often possible to assume the transformation of the transmitted signals through the channel 
to be approximately linear, one is faced with the reconstruction of the targets’ parameters from highly 
incomplete linear measurements. 

During the last decade, the theory of compressed sensing (CS) evolved around this particular setting.
In CS one typically tries to reconstruct an unknown vector $x$ from highly underdetermined linear
measurements $y = Ax$ under the additional assumption of sparsity in $x$. While only suboptimal results are
available so far for deterministic measurement matrices $A$ (see for instance the discussion in
[14, Chapter 6.1]), it is rather typical to consider matrices which are defined in terms of random variables. The
crucial point is that randomness can be implemented efficiently not only in theory but also in practice,
where it has already proven effective. This opens the field for advanced methods from probability
theory, which make it possible to prove strong recovery results that usually hold with very high
probability. Typical results guaranteeing recovery of sparse vectors $x$ from measurements $y = Ax$ provide
conditions on the minimal number of required measurements, ideally scaling linearly in the number of
nonzero entries of $x$ up to logarithmic factors. Given this, the theory of CS provides several algorithms
for reconstruction. Usually $x$ can be obtained either as the solution to a convex optimization problem
or by an iterative (greedy) algorithm [14].

Ideas from CS were probably first applied to radar in [1, 17, 12]. The extension to MIMO
radar has been carried out in [8, 26, 31, 32]. In a recent paper, Strohmer & Friedlander [27] analyze a
particular model of co-located MIMO radar and prove recovery results. They assume random probing
signals, which leads to a measurement process consisting of $N_R N_t$ random measurements, where $N_R$
is the number of receiver nodes and $N_t$ is the number of (time domain) samples taken at each
receiver. Under the additional assumption of randomly distributed targets, they show that basically the
condition $N_R N_t \gtrsim s \log(N)$, where $s$ is the number of targets and $N$ is the dimension of the target scene
$x$, is sufficient for reconstruction by minimizing the so-called LASSO functional. Thus, typical results
from CS on random measurements are recovered for these specific MIMO radar measurements.

We adopt this model and deepen its understanding by deriving further properties and results in the 
context of CS. First, we prove an estimate for the restricted isometry property (RIP) of the involved 
measurement matrix. The RIP guarantees uniform recovery (i.e., simultaneous recovery of all sufficiently 
sparse vectors from a single random draw of the matrix with high probability) and, moreover, yields 
strong error estimates for noisy measurements and only approximately sparse vectors. Compared to
standard estimates for standard random matrix constructions, we require more measurements for a given
sparsity. As we show, however, this number is still optimal and is due to the special structure of the considered
matrix. Motivated by the somewhat better results in [27], we furthermore deepen the analysis of the
measurement process by introducing a parameter for the fine structure of the support sets. In this way
we are able to provide nonuniform recovery results for “balanced” support sets which match the
number of measurements one would usually expect from common CS theory. Ultimately, this explains
the result in [27], since random support sets are, on average, sufficiently balanced and, hence, for
randomly distributed targets fewer measurements are sufficient. To the best of our knowledge, it has
not been observed earlier for realistic measurement matrices that the recovery performance may depend
significantly on the fine structure of the support sets.

1.1 The MIMO radar model 

Our MIMO radar model consists of $N_T$ transmit and $N_R$ receive antennas. We consider the setting
from [27] where some assumptions on the geometry of the antenna/target locations have been made. We
assume that

• the antenna arrays and the scatterers are located in the same two-dimensional plane, 

• the transmit/receive antennas are located along one common line (“monostatic radar”), 

• the distance of the targets from the antenna arrays is sufficiently large so that the radar return 
of any scatterer can be considered to be fully correlated across the array (“coherent propagation 
scenario”). 

We point out that our results can be generalized to a three-dimensional setting. However, since the
effects we want to study already occur in two dimensions, we concentrate on this case in order to keep
the notation simple. The transmit antennas occupy the positions $(0, i d_T \lambda) \in \mathbb{R}^2$, $i = 0, 1, \ldots, N_T - 1$,
where $\lambda$ denotes the wavelength of the carrier frequency of the radar system. The receive antennas are
located at $(0, j d_R \lambda)$, $j = 0, 1, \ldots, N_R - 1$. It is known that by choosing $d_T = 1/2$ and $d_R = N_T/2$ or,
alternatively, $d_T = N_R/2$ and $d_R = 1/2$, similar characteristics as those of a virtual array with $N_T N_R$
antennas can be obtained [15]. In this sense these particular choices for $d_T$ and $d_R$ are favorable compared
to other choices in practice. This fact will also become clear during our analysis. Throughout the paper
we concentrate on the case where

$$d_T = \frac{1}{2}, \qquad d_R = \frac{N_T}{2}. \tag{1}$$

The second case can be treated analogously.

Figure 1: Sparse MIMO radar: sector of the physical target domain for the case of $N_T = 8$ and $N_R = 8$
transmit/receive antennas, yielding an angular resolution of $N_T N_R = 64$. An exemplary delay grid
with a resolution of $N_t = 8$ is depicted. The considered Doppler effect introduces a third dimension,
parametrized by the Doppler shifts $f$.

The $i$th transmit antenna repeatedly sends a fixed complex continuous-time signal $s_i = s_i(t)$ of period
duration $T$. The reflected signals, due to reflections caused by one unit reflectivity target at position
$(r\cos(\theta), r\sin(\theta)) \in \mathbb{R}^2$, traveling with radial speed $v$, will be recorded at the $j$th receiver (cf. [32]) as


N'j' 

i=l 

where $c$ denotes the speed of light and $d_{ij}(t)$ is the distance the emitted wave has to travel from the $i$th
transmitter to the $j$th receiver at time $t$, given by

$$d_{ij}(t) = 2(r + vt) + \sin(\theta) d_T \lambda (i - 1) + \sin(\theta) d_R \lambda (j - 1).$$

After demodulation (removal of the carrier), assuming narrowband transmit waveforms, slowly
moving targets, and a far field scenario, i.e., $r \gg \max\{d_T N_T \lambda, d_R N_R \lambda\}$, the received baseband signal at
the $j$th receiver is approximately given by

$$y_j(t) \approx e^{i 2\pi \cdot 2\lambda^{-1} r}\, e^{i 2\pi \cdot \sin(\theta) d_R (j-1)} \sum_{i=1}^{N_T} e^{i 2\pi \cdot 2\lambda^{-1} v t}\, e^{i 2\pi \cdot \sin(\theta) d_T (i-1)}\, s_i(t - 2r/c). \tag{2}$$

(See also [32], where, however, the antennas are distributed freely over a small area.)

The parameters of a target are given by the triple $(\theta, r, v)$ representing the position (in radial
coordinates) and the radial speed. It will, however, be more convenient in the remaining part of this paper to
equivalently consider the triples $(\sin(\theta), 2r/c, 2\lambda^{-1} v)$ of angle, delay, and Doppler-shift parameters, which
appear in formula (2). The angle, delay, and Doppler-shift information can always easily be transformed
back into the physical coordinates.

Discretization

We switch to a discrete setting by considering sampled (at Nyquist rate) versions of the transmitted 
signals and, furthermore, assuming that the targets are located on a grid in the angle-delay-Doppler 
domain. In certain applications, this idealizing assumption might lead to so-called “gridding errors”. 
Thus, in practice, this would have to be resolved by an additional post-processing. Theoretical approaches 
for this problem can be found in [18, 9]. Recently, in [16], a new approach for exact reconstruction of 
time/frequency shifts (as considered in radar) from a continuous domain has been proposed. 

Let $\mathbf{s}_i \in \mathbb{C}^{N_t}$ be the discrete-time representation of the transmitted signal $s_i = s_i(t)$ sampled at
Nyquist rate over the time interval $[0, T)$, i.e.,

$$\mathbf{s}_i = \big(s_i(0), s_i(\Delta_t), s_i(2\Delta_t), \ldots, s_i((N_t - 1)\Delta_t)\big)^T, \qquad T = N_t \Delta_t, \quad \Delta_t = 1/(2B).$$

Under the assumption that the signals $s_i = s_i(t)$ are $T$-periodic and band-limited to $(-B, B)$, the
Nyquist-Shannon sampling theorem tells us that they are fully described by the discrete-time signals $\mathbf{s}_i$. In the
following we will always assume that the parameters $(\sin(\theta), 2r/c, 2\lambda^{-1} v)$ lie on an equidistant grid, i.e.,

$$(\sin(\theta), 2r/c, 2\lambda^{-1} v) = (\beta\Delta_\beta, \tau\Delta_\tau, f\Delta_f), \tag{3}$$

where $\beta$, $\tau$, and $f$ are integers. Throughout the rest of this paper we fix the stepsizes $\Delta_\beta$, $\Delta_\tau$, and $\Delta_f$
to

$$\Delta_\beta = \frac{2}{N_T N_R}, \qquad \Delta_\tau = \frac{1}{2B}, \qquad \Delta_f = \frac{1}{T}. \tag{4}$$

For a given Doppler shift $2\lambda^{-1} v = f\Delta_f \ll B$, the received signal $y_j = y_j(t)$ can again be assumed to be
band-limited to $(-B, B)$. In this case the time-discrete version

$$\mathbf{y}_j = \big(y_j(0), y_j(\Delta_t), y_j(2\Delta_t), \ldots, y_j((N_t - 1)\Delta_t)\big)^T \in \mathbb{C}^{N_t}$$

of the approximate received signal $y_j = y_j(t)$ in (2) at the $j$th receiver, caused by a target having
parameters as in (3), is given by the vector

$$\mathbf{y}_j = e^{i 2\pi \cdot c\lambda^{-1} \tau\Delta_\tau}\, e^{i 2\pi \cdot d_R \beta\Delta_\beta (j-1)} \sum_{i=1}^{N_T} e^{i 2\pi \cdot d_T \beta\Delta_\beta (i-1)}\, M_f T_\tau \mathbf{s}_i \in \mathbb{C}^{N_t}, \tag{5}$$

where $T_\tau$ is a circulant shift and $M_f$ is a linear modulation, defined entrywise by

$$(T_\tau \mathbf{s})_k = s_{k - \tau}, \qquad (M_f \mathbf{s})_k = e^{i 2\pi f k / N_t}\, s_k, \tag{6}$$

where $k - \tau$ stands for subtraction modulo $N_t$. Note that, due to periodicity of the operators $T_\tau$ and $M_f$
and the periodic influence of the parameter $\beta$ in (5), the maximal achievable number of grid points as
in (3) which can be differentiated by looking at the received signals $\mathbf{y}_j$ is bounded by $N_T N_R N_t^2$. Hence
we assume without loss of generality that

$$(\beta, \tau, f) \in \mathcal{G}, \qquad \mathcal{G} := [N_T N_R] \times [N_t] \times [N_t], \tag{7}$$

where, for any $L \in \mathbb{N}$, we write $[L]$ to denote the set $\{1, 2, \ldots, L\}$. Throughout this paper we will refer
to the triples of integers $(\beta, \tau, f)$ as parameters of the considered targets, keeping in mind that the actual
physical parameters can be obtained via the relations (3), (4).
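The circulant shift and the modulation can be made concrete in a few lines of Python. The sketch below is only an illustration under stated assumptions: it uses zero-based indices $k = 0, \ldots, N_t - 1$ and the entrywise forms $(T_\tau \mathbf{s})_k = s_{(k - \tau) \bmod N_t}$ and $(M_f \mathbf{s})_k = e^{2\pi i f k / N_t} s_k$; the function names `shift`, `modulate` and `close` are hypothetical and do not come from the paper. It checks the $N_t$-periodicity in $\tau$ and $f$ that limits the number of distinguishable grid points.

```python
import cmath

def shift(s, tau):
    """Circulant shift (assumed form): (T_tau s)_k = s_{(k - tau) mod N_t}."""
    n = len(s)
    return [s[(k - tau) % n] for k in range(n)]

def modulate(s, f):
    """Linear modulation (assumed form): (M_f s)_k = exp(2*pi*i*f*k/N_t) * s_k."""
    n = len(s)
    return [cmath.exp(2j * cmath.pi * f * k / n) * s[k] for k in range(n)]

def close(u, v, tol=1e-9):
    """Entrywise comparison of two complex vectors up to a tolerance."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

s = [1 + 0j, 2j, -1 + 1j, 0.5 + 0j]   # toy signal with N_t = 4 samples
n_t = len(s)

# Shifting or modulating by tau + N_t or f + N_t gives the same vector again.
shift_is_periodic = close(shift(s, 1), shift(s, 1 + n_t))
modulation_is_periodic = close(modulate(s, 3), modulate(s, 3 + n_t))
print(shift_is_periodic, modulation_is_periodic)
```

Because both operators repeat after $N_t$ steps, only $N_t$ delays and $N_t$ Doppler shifts can be told apart, which is the reason for restricting $\tau$ and $f$ to $[N_t]$.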


The linear measurement model 

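If, more generally, several targets $\Theta_k = (\beta_k, \tau_k, f_k)$, $k = 1, 2, \ldots, L$, with corresponding reflectivities
$\rho_k \in \mathbb{C}$ are present, then the superposition of signals

$$\mathbf{y}_j = \sum_{k=1}^{L} \rho_k\, e^{i 2\pi \cdot c\lambda^{-1} \tau_k\Delta_\tau} \Big( e^{i 2\pi \cdot d_R \beta_k\Delta_\beta (j-1)} \sum_{i=1}^{N_T} e^{i 2\pi \cdot d_T \beta_k\Delta_\beta (i-1)}\, M_{f_k} T_{\tau_k} \mathbf{s}_i \Big) \in \mathbb{C}^{N_t}$$

will be recorded at the $j$th receiver, see also (5). We define vectors $A^j_\Theta \in \mathbb{C}^{N_t}$ with

$$A^j_\Theta = e^{i 2\pi \cdot d_R \beta\Delta_\beta (j-1)} \sum_{i=1}^{N_T} e^{i 2\pi \cdot d_T \beta\Delta_\beta (i-1)}\, M_f T_\tau \mathbf{s}_i, \qquad \Theta = (\beta, \tau, f) \in \mathcal{G}, \tag{8}$$

which allows us to write $\mathbf{y}_j$ more conveniently as the sum $\sum_{k=1}^{L} \rho_k\, e^{i 2\pi \cdot c\lambda^{-1} \tau_k\Delta_\tau} A^j_{\Theta_k}$. The target scene (a
set of targets being present in the angle-delay-Doppler domain) can now equivalently be considered as a
vector $x = (x_\Theta, \Theta \in \mathcal{G}) \in \mathbb{C}^N$, $N = N_T N_R N_t^2$, being supported on the indices $\Theta = \Theta_k$, $k = 1, 2, \ldots, L$,
which correspond to the parameters $\Theta_k$ of the targets. Here, each nonzero entry is basically the reflectivity
parameter of the corresponding target (multiplied with a complex unit), given by

$$x_{\Theta_k} = \rho_k\, e^{i 2\pi \cdot c\lambda^{-1} \tau_k\Delta_\tau}.$$

The collection of all received signals $(\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_{N_R})$ can now conveniently be written as the matrix-vector
product $Ax$, where the matrix $A \in \mathbb{C}^{N_R N_t \times N}$ contains the columns $A_\Theta$, $\Theta \in \mathcal{G}$, with

$$A_\Theta = \big((A^1_\Theta)^T, (A^2_\Theta)^T, \ldots, (A^{N_R}_\Theta)^T\big)^T \in \mathbb{C}^{N_R N_t}. \tag{9}$$

Considering noise in the channel leads to the measurement model

$$y = Ax + n \in \mathbb{C}^{N_R N_t}, \tag{10}$$

where $n$ represents a noise vector. Since $x$ is an $N = N_T N_R N_t^2$-dimensional vector and the acquired
information in the measured vector $y$ is $m = N_R N_t$-dimensional, we end up with a highly underdetermined
system (10) from which the targets’ parameters (the support set of the vector $x$) have to be reconstructed.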
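To see how underdetermined the system is, the matrix $A$ can be assembled for toy sizes. The following NumPy sketch is an illustration only and makes several assumptions that are not taken from the paper: the function name `mimo_matrix` is hypothetical, the shift and modulation are implemented with zero-based sample indices as $(T_\tau \mathbf{s})_k = s_{(k-\tau) \bmod N_t}$ and $(M_f \mathbf{s})_k = e^{2\pi i f k/N_t} s_k$, the spacings are $d_T = 1/2$, $d_R = N_T/2$, and the probing signals are complex Gaussian with unit variance per entry.

```python
import numpy as np

def mimo_matrix(signals, n_r):
    """Assemble A column by column following (8), (9); assumes d_T = 1/2, d_R = N_T/2."""
    n_T, n_t = signals.shape                 # rows: transmitters, columns: time samples
    d_T, d_R = 0.5, n_T / 2
    delta_beta = 2.0 / (n_T * n_r)
    k = np.arange(n_t)
    cols = []
    for beta in range(1, n_T * n_r + 1):
        tx_phase = np.exp(2j * np.pi * d_T * beta * delta_beta * np.arange(n_T))
        rx_phase = np.exp(2j * np.pi * d_R * beta * delta_beta * np.arange(n_r))
        for tau in range(1, n_t + 1):
            shifted = np.roll(signals, tau, axis=1)        # T_tau applied to every s_i
            for f in range(1, n_t + 1):
                mod = np.exp(2j * np.pi * f * k / n_t)     # diagonal of M_f
                b = tx_phase @ (mod * shifted)             # sum over the transmitters
                cols.append(np.kron(rx_phase, b))          # stack the N_R receiver blocks
    return np.column_stack(cols)

rng = np.random.default_rng(0)
n_T, n_r, n_t = 2, 2, 4                                    # toy sizes, chosen for illustration
signals = (rng.standard_normal((n_T, n_t))
           + 1j * rng.standard_normal((n_T, n_t))) / np.sqrt(2)
A = mimo_matrix(signals, n_r)
print(A.shape)   # (8, 64): m = N_R * N_t rows, N = N_T * N_R * N_t**2 columns
```

Even for these tiny parameters there are eight times more unknowns than measurements, and the ratio $N/m = N_T N_t$ grows with both the number of transmitters and the number of samples.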


1.2 Recovery via $\ell_1$-minimization and the restricted isometry property

Radar scenes are typically sparse in the target domain, meaning $s = |\operatorname{supp}(x)| \ll N$. One can use this
additional assumption for the reconstruction of $x$. At this point ideas from compressed sensing enter
the field. A well established and computationally practicable approach for reconstructing a sparse
vector $x$ from incomplete measurements $y$ is basis pursuit denoising, which aims at reconstructing $x$ by
solving the convex optimization problem

$$\min_z \|z\|_1 \quad \text{subject to} \quad \|Az - y\|_2 \le \varrho, \tag{11}$$

where $\varrho$ is an upper estimate of the noise level. Here the $\|\cdot\|_1$-norm constitutes a convex relaxation of
the $\|\cdot\|_0$-norm counting the number of nonzero entries, which one aims to minimize in the first place.
Indeed, under suitable assumptions on the measurement matrix $A$, each solution $x^\#$ of (11) is close to
the unknown $s$-sparse vector $x$, i.e.,

$$\|x^\# - x\|_1 \le C \inf_{s\text{-sparse } x'} \|x' - x\|_1 + D \varrho \sqrt{s}, \tag{12}$$

$$\|x^\# - x\|_2 \le C\, \frac{\inf_{s\text{-sparse } x'} \|x' - x\|_1}{\sqrt{s}} + D \varrho, \tag{13}$$

where $C, D > 0$ are numerical constants.

A popular condition guaranteeing recovery via basis pursuit denoising is the restricted isometry
property (RIP). The RIP is said to be fulfilled if $A$ has a small restricted isometry constant, which is the
smallest number $\delta_s$ such that for all $s$-sparse vectors $x$ it holds

$$(1 - \delta_s) \|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_s) \|x\|_2^2. \tag{14}$$

A sufficiently small restricted isometry constant implies that the basis pursuit denoising approach (11)
yields approximations to any $s$-sparse vector $x$ from incomplete measurements $y$. The following theorem
is well-known [4], see for instance also [7, 5, 14], where the constants have subsequently been improved.

Theorem 1. Suppose the restricted isometry constant $\delta_{2s}$ of the matrix $A$ satisfies $\delta_{2s} < 1/\sqrt{2}$. Then,
for any $x \in \mathbb{C}^N$ and $y \in \mathbb{C}^m$ with $\|Ax - y\|_2 \le \varrho$, any solution $x^\#$ of (11) fulfills (12), (13), where the
constants $C, D > 0$ only depend on $\delta_{2s}$.

For certain random matrices, e.g., those with independent standard Gaussian entries, the following
choice of the number of measurements $m$ depending on the dimension $N$ and the sparsity level $s$ is
sufficient (and necessary, see below) for a small restricted isometry constant with high probability [14]:

$$m \gtrsim s \log(eN/s). \tag{15}$$

Apart from Gaussian measurements this dependence also occurs for many other (even structured) random
matrices, up to possibly additional logarithmic factors.
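The near-isometry behind the RIP can be probed empirically for a Gaussian matrix. The sketch below is a hedged illustration, not a computation of the restricted isometry constant: it only samples random $s$-sparse vectors, so the printed value is a lower bound on $\delta_s$ (the true constant is a supremum over all supports and is expensive to compute). The sizes, the seed, and the variable names are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N, s = 200, 400, 5
# Entries with variance 1/m, so that E||Ax||_2^2 = ||x||_2^2 for every fixed x.
A = rng.standard_normal((m, N)) / np.sqrt(m)

ratios = []
for _ in range(200):
    x = np.zeros(N)
    support = rng.choice(N, size=s, replace=False)   # random support of size s
    x[support] = rng.standard_normal(s)
    ratios.append(np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2)

# Largest observed deviation of ||Ax||^2 / ||x||^2 from 1 over the sampled vectors.
delta_lower = max(1 - min(ratios), max(ratios) - 1)
print(f"empirical lower bound on delta_s: {delta_lower:.3f}")
```

Since only 200 vectors are tested, a small printed value does not certify the RIP; it merely shows that the sampled sparse vectors are mapped almost isometrically.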

1.3 An optimal RIP result 

A main goal of this paper consists in proving the following result on the number of required samples $N_t$
guaranteeing a small restricted isometry constant of the scaled MIMO radar measurement matrix (see
(8), (9) for the definition)

$$\tilde{A} = \frac{1}{\sqrt{N_T N_R N_t}}\, A \in \mathbb{C}^{N_R N_t \times N_T N_R N_t^2}$$

and, moreover, in showing that this result is optimal in a certain sense. We assume that the transmit
pulse vectors $\mathbf{s}_i$ are independent standard complex Gaussian random vectors (see Appendix D for some
information about complex Gaussian random variables and vectors).

Theorem 2. Let the signals $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{N_T}$ generating the matrix $A$ via (8) and (9) be independent
standard complex Gaussian random vectors. If, for $\delta, \varepsilon \in (0, 1)$,

max{log^(eAf) log^(s), log(l/e)}, (16) 

then the restricted isometry constant of the rescaled matrix $\tilde{A}$ satisfies $\delta_s(\tilde{A}) \le \delta$ with probability at least
$1 - \varepsilon$.

Remark 3. The result as well as its proof extends to signals $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{N_T}$ being independent Rademacher
vectors (random vectors with independent entries taking the value +1 or −1 with equal probability) or
independent Steinhaus vectors (random vectors with independent entries that are uniformly distributed
on the complex torus). These generators may be of advantage for real radar systems where one prefers
constant magnitude signals for optimal energy consumption and in order to prevent amplifiers from
leaving their linear regime.

Recall that the matrix $\tilde{A}$ has $m = N_R N_t$ rows, i.e., $\tilde{A}$ represents $N_R N_t$ measurements. Since the
condition (16) is imposed on $N_t$ alone rather than on $m$, the theorem may seem suboptimal considering
that one would typically expect a scaling as in (15).
However, using general theoretical results on lower bounds for the necessary number of measurements,
we argue below that our RIP result above is in fact optimal (up to possibly logarithmic factors).

Instance optimality 

The concept of instance optimality [10, 14] allows one to derive general lower bounds on the number of
measurements $m$ required for stable sparse reconstruction via any algorithm and any possible measurement
matrix in $\mathbb{C}^{m \times N}$. To be precise, we say that a pair $(A, \Delta)$, where $\Delta : \mathbb{C}^m \to \mathbb{C}^N$ is a reconstruction map
and $A \in \mathbb{C}^{m \times N}$, is $\ell_1$-instance optimal of order $s$ with constant $C > 0$ if for all $x \in \mathbb{C}^N$ it holds

$$\|x - \Delta(Ax)\|_1 \le C \inf_{s\text{-sparse } x'} \|x' - x\|_1.$$

The following result states the announced lower bound.

Theorem 4 ([14, Thm. 11.6]). If a pair of measurement matrix $A \in \mathbb{C}^{m \times N}$ and reconstruction map
$\Delta : \mathbb{C}^m \to \mathbb{C}^N$ is $\ell_1$-instance optimal of order $s$ with constant $C$, then

$$m \ge C' s \log(eN/s) \tag{17}$$

for some constant $C'$ depending only on $C$.

In case of the MIMO radar measurement matrix $A$, this result can be applied to a certain submatrix
of $A$. To this end, we define $\mathcal{G}_{\beta^*} := \{(\beta, \tau, f) \in \mathcal{G} : \beta = \beta^*\}$ for an arbitrary but fixed angle parameter
$\beta^*$. For a target scene $x$ with support in $\mathcal{G}_{\beta^*}$ one finds (recalling (8)) that the product $Ax$ can be
expressed as outer product $w \otimes B x_{\mathcal{G}_{\beta^*}}$, where $B$ is an $N_t \times N_t^2$ matrix with columns

$$B_{(\tau, f)} = \sum_{i=1}^{N_T} e^{i 2\pi \cdot d_T \beta^* \Delta_\beta (i-1)}\, M_f T_\tau \mathbf{s}_i, \qquad (\tau, f) \in [N_t] \times [N_t],$$

$w$ is an $N_R$-dimensional vector, entrywise defined as

$$w_j = e^{i 2\pi \cdot d_R \beta^* \Delta_\beta (j-1)}, \qquad j \in [N_R],$$

and the vector $x_{\mathcal{G}_{\beta^*}}$ is obtained from $x$ by deleting all entries which do not belong to the set $\mathcal{G}_{\beta^*}$. Since
$\|w\|_2 = \sqrt{N_R}$, this implies that $\|\tilde{A} x\|_2 = \|\tilde{B} x_{\mathcal{G}_{\beta^*}}\|_2$, where $\tilde{B}$ stands for the scaled matrix
$\frac{1}{\sqrt{N_T N_t}} B$. Consequently, the
restricted isometry constant $\delta_{2s}(\tilde{B})$ is bounded by $\delta_{2s}(\tilde{A})$. We conclude that $\delta_{2s}(\tilde{A}) < 1/\sqrt{2}$ implies
$\delta_{2s}(\tilde{B}) < 1/\sqrt{2}$, so that by Theorem 1 the matrix $\tilde{B}$ in combination with the reconstruction map
corresponding to basis pursuit (11) with $\varrho = 0$ is $\ell_1$-instance optimal. Since $\tilde{B}$ has $N_t$ rows and $N_t^2$ columns,
(17) reads as $N_t \gtrsim s \log(e N_t^2 / s)$. Therefore, we can formulate the following result. (Essentially, this
result could similarly have been derived from Corollary 10.8 in [14].)

Theorem 5. If the restricted isometry constant of the scaled MIMO radar measurement matrix $\tilde{A}$
considered in Theorem 2 satisfies $\delta_{2s} < 1/\sqrt{2}$, then it holds

$$N_t \gtrsim s \log(e N_t^2 / s). \tag{18}$$

This shows that, up to additional logarithmic factors, Theorem 2 on the restricted isometry
constant for the MIMO radar measurement matrix is indeed optimal. Analogously one can also argue
that $\ell_1$-instance optimal uniform recovery in general, which is not necessarily based on the RIP, is
not possible with fewer measurements than in (16). Indeed, by assuming that there exists a corresponding
reconstruction map for the matrix $\tilde{A}$ one can use this map for constructing an $\ell_1$-instance optimal
reconstruction map for any submatrix $\tilde{B}$ as defined above, which again yields (18).


1.4 Nonuniform recovery for random support sets 

As stated by Theorem 2, for a fixed draw A of the measurement matrix the RIP is fulfilled with high 
probability and, hence (due to Theorem 1), all sparse vectors x can be reconstructed by considering 
measurements Ax. Since in this scenario A is fixed and reconstruction is possible for all sparse vectors 
x, we speak of a uniform recovery result. A nonuniform recovery result, on the other hand, only states 
that a given sparse vector x can be reconstructed from measurements Ax with high probability on the 
draw of the matrix A, i.e., no assertion on the reconstruction of all sparse vectors using a single draw of 
the matrix A is made. 

The results on the MIMO radar measurements by Strohmer & Friedlander [27] imply that, at least
on average, nonuniform reconstruction (for random support sets) succeeds with fewer measurements. In
[27] they derive results for nonuniform reconstruction via the debiased LASSO reconstruction scheme. In
case of the standard LASSO, an approximation $x^\#$ of $x$ is obtained by solving the minimization problem

$$\min_z \frac{1}{2} \|Az - y\|_2^2 + \lambda \|z\|_1, \tag{19}$$

where $\lambda > 0$ has to be chosen appropriately (in accordance with the noise level). In case that the support
$S = \operatorname{supp}(x)$ is recovered correctly, i.e., $\operatorname{supp}(x^\#) = S$, one might add a further step consisting of solving
the reduced least squares problem

$$\min_z \|A_S z - y\|_2^2, \tag{20}$$

where $A_S$ contains only the columns of $A$ corresponding to the support set $S$ and the minimization is
over all $z \in \mathbb{C}^S$. This second “debiasing” step leads in general to an improvement of the reconstructed
coefficients. The LASSO approach is closely related to basis pursuit denoising (11) but, due to the
unconstrained optimization, often favorable in practice [14]. Strohmer & Friedlander assume a random
target model in which, for a fixed sparsity parameter, every support set occurs with the same
probability and, moreover, the phases of the nonzero entries of the targets are independent and uniformly
distributed in $[0, 2\pi)$. This model has been studied in [6] in a general context and is called the generic
$s$-sparse target model therein; see also [14, Chap. 14]. Under additional technical assumptions, it has
been shown in [27]^1 that if $x$ is an $s$-sparse target scene, then

$$N_R N_t \gtrsim s \log(N) \tag{21}$$

measurements are sufficient to guarantee that, with high probability, each solution to the LASSO has
in particular the exact same support set as the target vector $x$, thus perfectly recovering the parameters
of the targets.
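The two-step debiased LASSO scheme of (19) and (20) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure analyzed in [27]: a generic complex Gaussian matrix stands in for the MIMO radar matrix, the LASSO is solved by plain iterative soft-thresholding (ISTA), and the function names, the value of `lam`, the iteration count, and the relative threshold used to read off the support are all choices made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink every magnitude by t and keep the phase."""
    mag = np.abs(z)
    return z * np.maximum(1.0 - t / np.maximum(mag, 1e-300), 0.0)

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize 0.5*||Az - y||_2^2 + lam*||z||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    z = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        z = soft_threshold(z - step * (A.conj().T @ (A @ z - y)), step * lam)
    return z

rng = np.random.default_rng(0)
m, N, s, sigma = 100, 200, 3, 0.01
A = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2 * m)
support = rng.choice(N, size=s, replace=False)
x = np.zeros(N, dtype=complex)
x[support] = np.exp(2j * np.pi * rng.random(s))   # unit magnitudes, uniformly random phases
noise = sigma * (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)
y = A @ x + noise

x_lasso = lasso_ista(A, y, lam=0.05)                                   # step (19)
# Read off the support; the relative threshold is a practical safeguard of this sketch.
support_hat = np.flatnonzero(np.abs(x_lasso) > 0.1 * np.abs(x_lasso).max())
x_debiased = np.zeros(N, dtype=complex)
x_debiased[support_hat] = np.linalg.lstsq(A[:, support_hat], y, rcond=None)[0]   # step (20)

print(sorted(support_hat.tolist()) == sorted(support.tolist()))
print(np.linalg.norm(x_lasso - x), np.linalg.norm(x_debiased - x))
```

The LASSO estimate is shrunk toward zero by the $\ell_1$ penalty, whereas the least squares step on the detected support removes this bias, which is why the second step is referred to as debiasing.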

1.5 Nonuniform recovery for deterministic support sets 

The results by Strohmer & Friedlander imply that there must exist support sets which can be
reconstructed using fewer measurements (than in our RIP result, see Theorem 2). As it turns out, certain
angle classes are not correlated among each other and, hence, support sets where the mass is uniformly
distributed among these angle classes are favorable for reconstruction.

Balanced support sets 

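Recall the definition of the columns $A_\Theta$, $\Theta \in \mathcal{G}$, of the MIMO radar measurement matrix $A$ from (8),
(9). For the inner product of two such columns $A_\Theta$, $A_{\Theta'}$ it holds

$$\langle A_\Theta, A_{\Theta'} \rangle = \sum_{k=1}^{N_R} e^{i 2\pi \cdot d_R (\beta' - \beta) \Delta_\beta (k-1)} \cdot \Big\langle \sum_{i=1}^{N_T} e^{i 2\pi \cdot d_T \beta \Delta_\beta (i-1)} M_f T_\tau \mathbf{s}_i,\; \sum_{i=1}^{N_T} e^{i 2\pi \cdot d_T \beta' \Delta_\beta (i-1)} M_{f'} T_{\tau'} \mathbf{s}_i \Big\rangle. \tag{22}$$

Due to the fact that $d_R = N_T/2$ and $\Delta_\beta = 2/(N_T N_R)$, the first factor can be calculated as

$$\sum_{k=1}^{N_R} e^{i 2\pi \cdot d_R (\beta' - \beta) \Delta_\beta (k-1)} = N_R\, \delta_{\beta, \beta'}, \qquad \text{where} \quad \delta_{\beta, \beta'} = \begin{cases} 1, & \text{if } \beta' - \beta \in N_R \mathbb{Z}, \\ 0, & \text{else.} \end{cases}$$

This means that for two columns $A_\Theta$, $A_{\Theta'}$ to be correlated it is necessary that $\beta' - \beta \in N_R \mathbb{Z}$.

The latter condition induces an equivalence relation on the set of possible angle parameters. We say that
two angle parameters $\beta$, $\beta'$ are equivalent if and only if $\beta' - \beta \in N_R \mathbb{Z}$, i.e.,

$$\beta \sim \beta' \quad :\Longleftrightarrow \quad \beta' - \beta \in N_R \mathbb{Z},$$

which, recalling from (7) that $\beta$ is taken from $\{1, 2, \ldots, N_T N_R\}$, leaves us with $N_R$ angle classes $[\beta]$,
each containing $N_T$ different angle parameters. The crucial fact is that, since the columns from different
angle classes are uncorrelated, the matrices $A_S^* A_S$, $S \subset \mathcal{G}$, are block diagonal. It holds in particular,

$$\|A_S^* A_S - \mathrm{Id}\|_{2 \to 2} = \max_{[\beta]} \|A_{S_{[\beta]}}^* A_{S_{[\beta]}} - \mathrm{Id}\|_{2 \to 2}, \tag{23}$$

where the subsets $S_{[\beta]} \subset S$ are defined as

$$S_{[\beta]} := \{\Theta' = (\beta', \tau', f') \in S : \beta' \sim \beta\}. \tag{24}$$

Considering (23) one may expect that a uniform distribution of any given support set $S$ over all angle
classes would make it most likely that the matrix $A_S^* A_S$ is well conditioned (since in this case the
maximal dimension of the submatrices $A_{S_{[\beta]}}^* A_{S_{[\beta]}}$ becomes minimal). This motivates the introduction of a
parameter $\eta$ measuring how evenly distributed over the angle classes such a support set is.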
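The decorrelation of different angle classes rests on a geometric sum, which can be checked numerically. The sketch below evaluates the first factor of the column inner product for the spacing $d_R = N_T/2$ and the stepsize $\Delta_\beta = 2/(N_T N_R)$; the function name `first_factor` and the toy values of $N_T$ and $N_R$ are assumptions made for illustration.

```python
import cmath

def first_factor(beta, beta_prime, n_T, n_R):
    """Sum over receivers of exp(2*pi*i * d_R * (beta' - beta) * Delta_beta * (k - 1))."""
    d_R = n_T / 2
    delta_beta = 2 / (n_T * n_R)
    return sum(
        cmath.exp(2j * cmath.pi * d_R * (beta_prime - beta) * delta_beta * (k - 1))
        for k in range(1, n_R + 1)
    )

n_T, n_R = 4, 3
same_class = first_factor(2, 2 + n_R, n_T, n_R)    # beta' - beta = N_R: same angle class
different_class = first_factor(2, 3, n_T, n_R)     # beta' - beta = 1: different classes
print(abs(same_class), abs(different_class))
```

Since $d_R \Delta_\beta = 1/N_R$, the summands are $N_R$th roots of unity. They add up to $N_R$ when $\beta' - \beta$ is a multiple of $N_R$ and cancel otherwise, up to rounding errors.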


^1 Note that there is a typo in [27, Thm. 5], where an additional factor Nt appears on the left-hand side of (21).



Definition 6. The balancedness parameter of a support set $S \subset \mathcal{G}$ is the smallest number $\eta$ such that for all
angle classes $[\beta]$ it holds

$$|S_{[\beta]}| \le \eta\, \frac{|S|}{N_R}.$$

We say that $S$ is $\eta$-balanced if the balancedness parameter of $S$ equals $\eta$.

According to this definition the balancedness parameter can be expressed as

$$\eta = \max_{[\beta]} \frac{N_R\, |S_{[\beta]}|}{|S|}.$$

Clearly, the balancedness parameter takes values in $[1, N_R]$, where $\eta = 1$ means that the $|S|$ targets are
perfectly balanced with respect to the $N_R$ angle classes (with $|S_{[\beta]}| = |S|/N_R$), and, on the other hand,
$\eta = N_R$ corresponds to the case where all targets are located in one particular angle class.
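The balancedness parameter is easy to compute for a concrete support set. The sketch below represents targets as integer triples $(\beta, \tau, f)$ and identifies the angle class of $\beta$ with the residue $\beta \bmod N_R$; the function name `balancedness` and the two example supports are hypothetical and only serve as an illustration of the formula $\eta = \max_{[\beta]} N_R |S_{[\beta]}| / |S|$.

```python
from collections import Counter
from fractions import Fraction

def balancedness(support, n_R):
    """eta = max over angle classes of N_R * |S_[beta]| / |S|, as an exact fraction."""
    class_sizes = Counter(beta % n_R for (beta, tau, f) in support)
    return Fraction(n_R * max(class_sizes.values()), len(support))

n_R = 4
balanced = [(1, 1, 1), (2, 3, 1), (3, 2, 2), (4, 1, 4)]      # one target per angle class
clustered = [(1, 1, 1), (5, 3, 1), (9, 2, 2), (13, 1, 4)]    # all targets in the class of beta = 1

print(balancedness(balanced, n_R))    # 1
print(balancedness(clustered, n_R))   # 4
```

The first support attains the best value $\eta = 1$, while the second one, whose angle parameters all differ by multiples of $N_R$, attains the worst value $\eta = N_R$.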


Nonuniform recovery results 

Next we present our nonuniform recovery results for $s$-sparse target scenes with $\eta$-balanced support
sets (see Definition 6 above) and thereby reveal an essentially linear dependence of the number of
measurements $m = N_R N_t$ required for reconstruction on the number of targets $s$. Note how the
parameter $\eta$ enters the corresponding condition (25) below. In the worst possible case, where $\eta = N_R$, we
obtain essentially the same scaling as for the RIP estimate in (16), while for the best possible case $\eta = 1$
we obtain the announced linear scaling.

For the following we recall that a Steinhaus sequence is a sequence of independent random variables
that are uniformly distributed on the complex unit circle $\{z \in \mathbb{C} : |z| = 1\}$. Due to the fact that the grid
$\mathcal{G}$ (see (7)) has $N = N_T N_R N_t^2$ elements, a target scene $x$ can be regarded as a vector in $\mathbb{C}^N$.

Theorem 7. Let $x \in \mathbb{C}^N$ be an $s$-sparse target scene with $\eta$-balanced support set and assume that
measurements $y = Ax + n \in \mathbb{C}^{N_R N_t}$ are given, where the signals $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{N_T}$ generating the columns
of $A$ (according to (8), (9)) are independent standard complex Gaussian random vectors, and the noise
vector $n$ is a mean-zero complex Gaussian random vector with variance $\sigma^2$. If

NRNt>r,s\og^{N/e), (25) 


and the signs of the nonzero entries of $x$ form a Steinhaus sequence while the magnitudes satisfy
min \xe\ > , \/21og(Ar), 

0Gsupp(a;) yJNj'NjiNt 

then, with probability at least 1 — 7 max{ 5 , 

(a) the solution $x^\#$ to the LASSO (19) with $\lambda = 2\sigma \sqrt{2 N_T N_R N_t \log(N)}$ satisfies
$\operatorname{supp}(x^\#) = \operatorname{supp}(x)$.


(b) the solution to the least squares problem (20) with $S = \operatorname{supp}(x^\#)$ satisfies


(27) 


Note that, in view of the scaling of s in (25), the approximation estimate (27) actually implies 
that — a; 5||2 < a/y/Nr, where Nt is the number of transmitters. Furthermore, we would like to 
point out that, for technical reasons, we did not consider deterministic signs of the nonzero entries of
$x$. Nevertheless, we believe that with ideas similar to those in [19] such a further generalization would be
possible (although technical).

The next theorem, our second main result on nonuniform recovery, makes an assertion about
reconstruction via basis pursuit denoising. Unlike the previous Theorem 7, the following result deals with
target scenes which are not exactly sparse but rather can be approximated well by sparse vectors. Again,
the balancedness parameter $\eta$ of the considered support sets is essential.

Theorem 8. Let $x \in \mathbb{C}^N$ be a target scene, let $S \subset \mathcal{G}$ be an index set corresponding to $s$ largest (by
magnitude) entries in $x$, and let $S$ be $\eta$-balanced. Further, assume that the signs of the coefficients $x_S$
form a Steinhaus sequence. Assume measurements $y = Ax + n \in \mathbb{C}^{N_R N_t}$ are given, where the signals
$\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{N_T}$ generating the columns of the measurement matrix $A$ (according to (8), (9)) are
independent standard Gaussian random vectors, and the noise vector $n$ is known to satisfy $\|n\|_2 \le \varrho$. If


NnNt > r]slog^{N/e), (28) 

then, with probability at least $1 - \varepsilon$, the solution $x^\#$ to the basis pursuit denoising program (11) satisfies

Q\fs 


\\x*-x\\ 2 <Ci inf \\x'- x\\i F C 2 - r———, 

s-sparse x y 


where $C_1$ and $C_2$ are numerical constants.

Remark 9. Like Theorem 2 concerning the RIP, Theorems 7 and 8 extend to signals $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_{N_T}$
being independent Rademacher or Steinhaus sequences; compare also with Remark 3.


1.6 The Doppler-free scenario 

In [27] a Doppler-free scenario (for the case of slowly moving or stationary targets) has also been
analyzed. We want to point out that this case is also covered by our analysis. The measurement matrix
$A'$ corresponding to the Doppler-free case can be obtained from $A$ by deleting all columns with
nonzero Doppler shifts $f$. Therefore, in view of (14), our RIP result (see Theorem 2) automatically
holds for the Doppler-free case as well, where one may even replace the factor $\log(N) = \log(N_T N_R N_t^2)$
by $\log(N_T N_R N_t)$. The proof of our nonuniform recovery results, stated as Theorems 7 and 8, is based
on Proposition 13, which provides estimates for the singular values of the matrices $A_S$, where $S$ stands for an
arbitrary (but balanced) support set. These estimates, of course, hold in particular when considering
only matrices $A_S$ with the property that $S$ does not contain any index $(\beta, \tau, f)$ with $f \ne 0$ and is
balanced. The reader may verify that all remaining arguments in Section 3 hold equally
for the Doppler-free case and, thus, analogous results on nonuniform recovery can be proven.

Outline 

In Section 2 we prove the RIP result (Theorem 2) by reformulating the restricted isometry constant as
the supremum of a chaos process and then applying a general result on such processes shown in [20, 11].
Section 3 is devoted to the proofs of our nonuniform recovery results for balanced support sets (Theorems
7 and 8). The proof of Theorem 7 is based on a general reconstruction result for the LASSO approach
taken from [6], which we introduce in Section 3.1. Sections 3.1.1-3.1.3 are devoted to the verification
of the assumptions needed for applying the reconstruction result. To this end a probabilistic analysis of
the extremal singular values of the matrices $A_S$, which we state as Proposition 13, is essential. Since
the proof of Proposition 13 is rather extensive, it appears in a separate section, namely Section 3.3. In
Section 3.2 we prove Theorem 8 by applying a more general recovery result for the basis pursuit denoising
approach, taken from [14]. Also in this case the verification of the assumptions depends crucially on the
analysis of the singular values of $A_S$ provided by Proposition 13. Finally, Section 4 is devoted to some
numerical experiments supporting our theoretical results on the influence of the balancedness parameter
$\eta$ (of the considered support sets) on the recovery performance.

In the Appendix we list basic calculations and technical proofs which for the sake of a clearer pre¬ 
sentation do not appear in the main part. Furthermore, some tools from probability theory and a short 
introduction of standard complex Gaussian random variables and vectors can be found here. 


Notation 

For a complex number $z$ we write $\bar{z}$ for the complex conjugate and, for nonzero $z$, we set $\operatorname{sgn}(z) := z/|z|$.
Given a vector $x = (x_1, x_2, \ldots)^T$ with complex entries, we write $\|x\|_p = (\sum_k |x_k|^p)^{1/p}$, $1 \le p < \infty$, to
denote the usual $\ell_p$-norm of $x$, and $\|x\|_\infty = \max_k |x_k|$. Moreover, $\operatorname{supp}(x) = \{k : x_k \neq 0\}$ denotes the
support set of $x$. Occasionally we also write $[x]_k$ to denote the $k$th entry of the vector $x$. We write
Id for the identity matrix (with appropriate dimensions becoming clear from the context). Given a
complex valued matrix $A$, we write $[A]_{\ell,k}$ to denote the $k$th entry of the $\ell$th row of $A$. The Hermitian
transpose will be denoted by $A^*$. The spectral norm is denoted by $\|A\|_{2\to 2} = \max_{\|x\|_2 = 1} \|Ax\|_2$. The
Frobenius norm of a matrix $A$ is given by $\|A\|_F = (\sum_{\ell,k} |[A]_{\ell,k}|^2)^{1/2}$. Given the sequence of singular
values $\sigma(A) = (\sigma_1(A), \sigma_2(A), \ldots)$ of a matrix $A$, the Schatten $p$-norm, for $1 \le p \le \infty$, is given by
$\|A\|_{S_p} := \|\sigma(A)\|_p$. Note the special cases $\|A\|_{S_2} = \|A\|_F$ and $\|A\|_{S_\infty} = \|A\|_{2\to 2}$. For a set $\mathcal{A}$ of
matrices, the parameters $d_{2\to 2}(\mathcal{A}) = \sup_{A \in \mathcal{A}} \|A\|_{2\to 2}$, $d_F(\mathcal{A}) = \sup_{A \in \mathcal{A}} \|A\|_F$ denote the diameters of
$\mathcal{A}$ with respect to the spectral norm and the Frobenius norm, respectively. Given a number $L \in \mathbb{N}$, we
write $[L]$ to denote the set $\{1, 2, \ldots, L\}$. We will also make use of the symbol $\delta_{k,\ell}$ or, more generally,
$\hat{\delta}_{k,\ell}$, for $k, \ell \in \mathbb{Z}$ and $L \in \mathbb{N}$. Here $\delta_{k,\ell}$ is the usual Kronecker delta, which is equal to 1 if $k = \ell$ and 0
otherwise. The symbol $\hat{\delta}_{k,\ell}$, on the other hand, represents equality up to multiples of $L$, i.e.,

\[ \hat{\delta}_{k,\ell} = \begin{cases} 1 & \text{if } k - \ell \in L\mathbb{Z}, \\ 0 & \text{otherwise.} \end{cases} \tag{29} \]


We write $A \lesssim B$ if there is a constant $c_1$ with $A \le c_1 B$, where $A$, $B$ may depend on further parameters.
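The notation above can be made concrete with a few lines of Python. The following is only an illustrative sketch using the standard library; the function names (`lp_norm`, `linf_norm`, `sgn`, `support`, `periodic_delta`) are hypothetical helpers chosen here and do not appear in the paper.

```python
def lp_norm(x, p):
    """l_p norm of a complex vector for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def linf_norm(x):
    """l_infinity norm: largest absolute entry."""
    return max(abs(v) for v in x)


def sgn(z):
    """Complex sign z/|z|, defined for nonzero z only."""
    if z == 0:
        raise ValueError("sgn is only defined for nonzero z")
    return z / abs(z)


def support(x):
    """Support set {k : x_k != 0}, using 1-based indices as in the text."""
    return {k for k, v in enumerate(x, start=1) if v != 0}


def periodic_delta(k, l, L):
    """Equality up to multiples of L, as in (29): 1 if k - l lies in L*Z, else 0."""
    return 1 if (k - l) % L == 0 else 0


x = [3 + 4j, 0, -1j]
print(lp_norm(x, 1), lp_norm(x, 2), linf_norm(x), support(x))
print(periodic_delta(7, 1, 3), periodic_delta(2, 1, 3))
```

Python's `%` operator returns a nonnegative remainder for a positive modulus, so `periodic_delta` also behaves as intended for negative differences.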


2 Uniform recovery via the RIP 

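In this section we prove Theorem 2 concerning the RIP of the MIMO radar measurement matrix. In the
following, $\mathbf{s}$ denotes the vector containing all signals $\mathbf{s}_i$, i.e.,

\[ \mathbf{s} = (\mathbf{s}_1^T, \mathbf{s}_2^T, \ldots, \mathbf{s}_{N_T}^T)^T \in \mathbb{C}^{N_T N_t}. \]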


2.1 Reformulation as a chaos process 

Our proof is based on [20], where bounds for certain chaos processes have been developed. The key is to
reformulate the radar measurements $x \mapsto \tilde{A}x$ as a mapping $x \mapsto V_x \mathbf{s}$, where the matrix $V_x$, depending
on $x$, is given by

\[ V_x := \frac{1}{\sqrt{N_T N_R N_t}} \sum_{\Theta \in \mathcal{G}} x_\Theta X_\Theta, \tag{30} \]


\i ?1 

and where $X_\Theta$ is an $N_R \times N_T$ block matrix consisting of the $N_t \times N_t$ blocks $X_\Theta^{(i,j)}$ with

Xq'^'^ ■= g*27r-dK/3A3(i-l)gi27r.dT/3ApO'-l)^^ 

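Due to the fact that the signal vector $\mathbf{s}$ is isotropic, we have for each $x$ that $\mathbb{E}\|V_x \mathbf{s}\|_2^2 = \|V_x\|_F^2$, and,
due to the fact that the matrices $\frac{1}{\sqrt{N_T N_R N_t}} X_\Theta$, $\Theta \in \mathcal{G}$, are orthogonal (see Appendix A), $\|V_x\|_F = \|x\|_2$.

This enables us to express the restricted isometry constant $\delta_s$ as the supremum of a chaos process of the
form

\[ \delta_s = \sup_{\substack{s\text{-sparse } x \\ \|x\|_2 \le 1}} \big| \|\tilde{A}x\|_2^2 - \|x\|_2^2 \big| = \sup_{V \in \mathcal{A}} \big| \|V\mathbf{s}\|_2^2 - \mathbb{E}\|V\mathbf{s}\|_2^2 \big|, \tag{32} \]

where $\mathcal{A}$ denotes the set of matrices given by

\[ \mathcal{A} := \{ V_x : x \text{ is } s\text{-sparse}, \ \|x\|_2 \le 1 \}. \tag{33} \]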


Dudley's entropy integral

\[ \mathcal{D} := \int_0^{d_{2\to 2}(\mathcal{A})} \sqrt{\log \mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u)} \, du \tag{34} \]

for the set $\mathcal{A}$, where $\mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u)$ denotes the covering numbers, i.e., the minimal number of $\|\cdot\|_{2\to 2}$-balls
of radius $u$ required to cover $\mathcal{A}$, provides sufficient information about the complexity of the set
$\mathcal{A}$ for our purposes. Estimates for the parameter $\mathcal{D}$ will in fact lead to good bounds for the restricted isometry
constant in (32). The following theorem, which is a direct implication of [20, Thm. 3.1] and a slight
improvement due to Dirksen [11], is our crucial tool. Recall that a random vector $X$ is $L$-subgaussian
if it is isotropic and $\mathbb{P}(|\langle X, \xi \rangle| \ge t) \le 2\exp(-t^2/(2L^2))$ for every $\xi$ with $\|\xi\|_2 = 1$ and any $t > 0$.

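Theorem 10. Let $\mathcal{A}$ be a symmetric set of matrices, $-\mathcal{A} = \mathcal{A}$, and let $\xi$ be a random vector whose
entries are independent, mean-zero, variance 1, and $L$-subgaussian random variables. Set

\[ E = \mathcal{D}(\mathcal{D} + d_F(\mathcal{A})), \qquad V = d_{2\to 2}(\mathcal{A})\, d_F(\mathcal{A}), \qquad U = d_{2\to 2}^2(\mathcal{A}). \]

Then, for $t > 0$,

\[ \mathbb{P}\Big( \sup_{A \in \mathcal{A}} \big| \|A\xi\|_2^2 - \mathbb{E}\|A\xi\|_2^2 \big| \ge c_1 E + t \Big) \le 2\exp\big( -c_2 \min\{ t^2/V^2, \ t/U \} \big), \]

where the constants $c_1, c_2$ depend only on $L$.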
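To see how the three parameters of Theorem 10 interact, the bound can be evaluated numerically. The sketch below is purely illustrative: the function name `chaos_tail_bound` is hypothetical, and the values used for the constants $c_1, c_2$ are placeholders, since the theorem only asserts that such constants exist.

```python
import math


def chaos_tail_bound(dudley, d_F, d_22, t, c1=1.0, c2=1.0):
    """Evaluate the deviation level and tail probability of Theorem 10.

    dudley : value (or upper bound) of Dudley's entropy integral D
    d_F    : Frobenius-norm parameter d_F(A)
    d_22   : spectral-norm parameter d_{2->2}(A)
    c1, c2 : placeholder constants (not specified by the theorem)
    """
    E = dudley * (dudley + d_F)
    V = d_22 * d_F
    U = d_22 ** 2
    level = c1 * E + t
    prob = 2.0 * math.exp(-c2 * min(t * t / (V * V), t / U))
    return level, min(prob, 1.0)


level, prob = chaos_tail_bound(dudley=0.1, d_F=1.0, d_22=0.05, t=0.25)
print(level, prob)
```

In the regime relevant for the RIP proof, $d_F(\mathcal{A}) = 1$ and $d_{2\to 2}(\mathcal{A})$ is small, so the subgaussian branch $t^2/V^2$ determines the minimum for small $t$.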

Note that the original result [20, Thm. 3.1] is formulated in terms of Talagrand's chaining functional
$\gamma_2$, whereas Theorem 10 above is formulated in terms of Dudley's entropy integral $\mathcal{D}$, a common
upper estimate for the $\gamma_2$-functional, which is sufficient for our purposes and more convenient. For more
details on $\gamma_2$-functionals we refer to [28, 29]. In the original version of Theorem 10 in [20] the quantity $V$
was defined as $V = d_{2\to 2}(\mathcal{A})(\mathcal{D} + d_F(\mathcal{A}))$. Dirksen [11] achieved a slight improvement by showing that
the summand $\mathcal{D}$ can be omitted.

2.2 Proof of Theorem 2 

In order to apply the above theorem we have to bound $d_F(\mathcal{A})$, $d_{2\to 2}(\mathcal{A})$, and $\mathcal{D}$. Estimates for the
quantities $d_{2\to 2}(\mathcal{A})$ and $d_F(\mathcal{A})$ can be derived in a straightforward manner. We have already seen that
$\|V_x\|_F = \|x\|_2$ and, thus, $d_F(\mathcal{A}) = 1$. In order to provide a bound for $d_{2\to 2}(\mathcal{A})$, we estimate the norms
$\|X_\Theta\|_{2\to 2}$ of the matrices from the definition in (30). The $(i, j)$th $N_t \times N_t$ block of the product $X_\Theta^* X_\Theta$
can be calculated (cf. Appendix A, equation (74)) as

[XqXo]'^^' 

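Thus, by defining a vector $v \in \mathbb{C}^{N_T}$ entrywise as

\[ v_j = e^{i 2\pi d_T \beta \Delta_\beta (j-1)}, \qquad j \in [N_T], \]

the corresponding operator norm can be calculated as

\[ \|X_\Theta\|_{2\to 2}^2 = \|X_\Theta^* X_\Theta\|_{2\to 2} = N_R \|v v^*\|_{2\to 2} = N_R \|v\|_2^2 = N_R N_T. \]

Since for each $s$-sparse vector $x$ we have $\|x\|_1 \le \sqrt{s}\,\|x\|_2$, we obtain

\[ \|V_x\|_{2\to 2} \le \frac{1}{\sqrt{N_T N_R N_t}} \sum_{\Theta \in \mathcal{G}} |x_\Theta| \, \|X_\Theta\|_{2\to 2} = \frac{\|x\|_1}{\sqrt{N_t}} \le \sqrt{\frac{s}{N_t}} \, \|x\|_2. \]

For each $V_x \in \mathcal{A}$ we have, by definition, that $\|x\|_2 \le 1$, which means that

\[ d_F(\mathcal{A}) = 1, \qquad \text{and} \qquad d_{2\to 2}(\mathcal{A}) \le \sqrt{s/N_t}. \tag{36} \]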

In order to estimate the entropy integral $\mathcal{D}$ in (34), it is in particular necessary to provide good
bounds for the appearing covering numbers $\mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u)$. In view of (36) it is sufficient to consider
$0 < u \le \sqrt{s/N_t}$. The proof of Lemma 11 can be found in Appendix B.

Lemma 11. For $0 < u \le \sqrt{s/N_t}$, it holds

\[ \log \mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u) \lesssim s \min\Big\{ \log\Big( \frac{N}{u^2 N_t} \Big), \ \frac{\log^2(N)}{u^2 N_t} \Big\}. \tag{37} \]

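Now we are able to estimate Dudley's integral as

\[ \mathcal{D} \lesssim \sqrt{\frac{s}{N_t}} \, \log(eN) \log(s). \tag{38} \]

To this end, we split the integral into two pieces,

\[ \mathcal{D} = \int_0^{d_{2\to 2}(\mathcal{A})} \sqrt{\log \mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u)} \, du
\le \int_0^{1/\sqrt{N_t}} \sqrt{\log \mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u)} \, du + \int_{1/\sqrt{N_t}}^{\sqrt{s/N_t}} \sqrt{\log \mathcal{N}(\mathcal{A}, \|\cdot\|_{2\to 2}, u)} \, du =: I_1 + I_2. \]

We estimate the first integral by using the first bound of Lemma 11 to obtain

\[ I_1 \lesssim \int_0^{1/\sqrt{N_t}} \sqrt{ s \log\Big( \frac{N}{u^2 N_t} \Big) } \, du
\le \sqrt{s} \, \Big( \int_0^{1/\sqrt{N_t}} 1 \, du \Big)^{1/2} \Big( \int_0^{1/\sqrt{N_t}} \log\Big( \frac{N}{u^2 N_t} \Big) du \Big)^{1/2}, \]

where, in the second step, we used the Cauchy-Schwarz inequality. The latter integral can easily be
calculated as

\[ \int_0^{1/\sqrt{N_t}} \log\Big( \frac{N}{u^2 N_t} \Big) du = \int_0^{1/\sqrt{N_t}} \log(N/N_t) - \log(u^2) \, du
= \frac{1}{\sqrt{N_t}} \log(N/N_t) - \Big[ u (\log(u^2) - 2) \Big]_0^{1/\sqrt{N_t}} = \frac{1}{\sqrt{N_t}} (\log(N) + 2), \]

which, together with the estimate from above, yields

\[ I_1 \lesssim \sqrt{\frac{s}{N_t}} \sqrt{\log(N) + 2}. \]

For the second integral $I_2$ we use the second bound provided by Lemma 11 and obtain

\[ I_2 \lesssim \int_{1/\sqrt{N_t}}^{\sqrt{s/N_t}} \sqrt{ \frac{s}{N_t} \frac{\log^2(N)}{u^2} } \, du = \sqrt{\frac{s}{N_t}} \log(N) \int_{1/\sqrt{N_t}}^{\sqrt{s/N_t}} u^{-1} \, du = \sqrt{\frac{s}{N_t}} \log(N) \log(\sqrt{s}). \]

A combination of the estimates for $I_1$ and $I_2$ from above yields (38).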
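The two closed-form integrals used in the estimates of $I_1$ and $I_2$ can be checked numerically. The following sketch uses a plain midpoint rule from the standard library; the function names and the sample values $N = 65536$, $N_t = 64$, $s = 16$ are arbitrary choices made here for illustration and are not taken from the paper.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def first_piece_integral(N, N_t):
    """Numerical value of the integral of log(N / (u^2 N_t)) over [0, 1/sqrt(N_t)]."""
    upper = 1.0 / math.sqrt(N_t)
    return midpoint_integral(lambda u: math.log(N / (u * u * N_t)), 0.0, upper)


def first_piece_closed_form(N, N_t):
    """Closed form (log(N) + 2) / sqrt(N_t) derived in the text."""
    return (math.log(N) + 2.0) / math.sqrt(N_t)


def second_piece_integral(s, N_t):
    """Numerical value of the integral of 1/u over [1/sqrt(N_t), sqrt(s/N_t)]."""
    lower = 1.0 / math.sqrt(N_t)
    upper = math.sqrt(s / N_t)
    return midpoint_integral(lambda u: 1.0 / u, lower, upper)


print(first_piece_integral(65536, 64), first_piece_closed_form(65536, 64))
print(second_piece_integral(16, 64), math.log(math.sqrt(16)))
```

The integrand of the first integral has an integrable logarithmic singularity at $u = 0$, which is why a rule that avoids the endpoints is used.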

Now, having a bound for the entropy integral $\mathcal{D}$, we are well equipped to apply Theorem 10 and,
thereby, prove Theorem 2. To this end let $c > 0$ denote a constant to be chosen sufficiently large and

\[ N_t \ge c \, \delta^{-2} s \max\{ \log^2(eN) \log^2(s), \ \log(1/\varepsilon) \}. \tag{39} \]

Due to (38) it follows that, for some constant $C > 0$,

\[ \mathcal{D} \le C \sqrt{\frac{s}{N_t}} \, \log(eN) \log(s). \]

These two bounds imply that $\mathcal{D} \le C\delta/\sqrt{c}$ and, therefore, since $d_F(\mathcal{A}) = 1$, we obtain for the quantity $E$
from Theorem 10 (if $c$ is chosen sufficiently large),

\[ E = \mathcal{D}(\mathcal{D} + 1) \le \frac{(C\delta)^2}{c} + \frac{C\delta}{\sqrt{c}} \le \frac{\delta}{2c_1}, \]

where $c_1$ is the constant from Theorem 10. A direct application of Theorem 10, again using $d_F(\mathcal{A}) = 1$,
yields

\[ \mathbb{P}(\delta_s \ge \delta) \le \mathbb{P}(\delta_s \ge c_1 E + \delta/2) \le 2\exp\Big( -c_2 \min\Big\{ \frac{\delta^2}{4V^2}, \frac{\delta}{2U} \Big\} \Big) \le 2\exp\Big( -\frac{c_2 \, \delta^2 N_t}{4s} \Big), \]

where for the last inequality we used the estimate for $d_{2\to 2}(\mathcal{A})$ from (36). Due to (39), the latter term
is bounded by $\varepsilon$, again provided that $c$ is sufficiently large, which finalizes the proof of Theorem 2. □


3 Nonuniform recovery results

The proofs of our nonuniform recovery results (see Theorems 7 and 8) rely on the fact that, for
sufficiently well balanced support sets $S \subset \mathcal{G}$, the singular values of the matrix $\tilde{A}_S$ are close to one
with high probability. Along with further properties, this enables us to apply general recovery results
for both the LASSO and basis pursuit denoising, which we take from [6] and [14], respectively. In the
following analysis we consider a scaled version of the measurement process, namely

\[ \tilde{y} = \tilde{A}x + \tilde{n}, \]

where $\tilde{y}$, $\tilde{A}$ and $\tilde{n}$ denote $y$, $A$ and $n$ scaled by the factor $1/\sqrt{N_T N_R N_t}$.
In the situation of Theorem 7, $\tilde{n}$ is now a mean-zero complex Gaussian random vector with variance
$\tilde{\sigma}^2 = \sigma^2/(N_T N_R N_t)$.

3.1 Exact support recovery via the (debiased) LASSO 

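In order to prove Theorem 7 we use a general recovery result from [6], which provides the sufficient
conditions $(C_1)$-$(C_5)$ below on fixed instances of the measurement matrix $\tilde{A}$, the target scene vector $x$ and
the noise vector $\tilde{n}$ that imply perfect reconstruction of the support set $\operatorname{supp}(x)$ from the measurements
$\tilde{y}$ via the LASSO

\[ \min_x \ \tfrac{1}{2} \|\tilde{A}x - \tilde{y}\|_2^2 + 2\mu \|x\|_1, \qquad \mu = \tilde{\sigma} \sqrt{2\log(N)}. \tag{40} \]

Later (see Sections 3.1.2, 3.1.3 below) we will show that these conditions are fulfilled with high probability.
In the following we write $\Pi_S$ to denote the projection onto the linear space spanned by the columns of
$\tilde{A}_S$.

($C_1$) The matrix $\tilde{A}_S^* \tilde{A}_S$ is invertible and obeys $\|(\tilde{A}_S^* \tilde{A}_S)^{-1}\|_{2\to 2} \le 2$,

($C_2$) $\|\tilde{A}_{S^c}^* \tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \le 1/4$,

($C_3$) $\|(\tilde{A}_S^* \tilde{A}_S)^{-1} \tilde{A}_S^* \tilde{n}\|_\infty \le 2\mu$,

($C_4$) $\|\tilde{A}_{S^c}^* (\mathrm{Id} - \Pi_S) \tilde{n}\|_\infty \le \sqrt{2}\mu$,

($C_5$) $\|(\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \le 3$.


In [6] it is shown, using Conditions $(C_1)$-$(C_5)$, that the difference vector $h = \hat{x} - x$ between the
solution $\hat{x}$ to the LASSO and the original vector $x$ fulfills $\operatorname{supp}(h) \subset \operatorname{supp}(x)$ so that $\operatorname{supp}(\hat{x}) = \operatorname{supp}(x)$
and, even more, the explicit formula (41) for $h$ is given.

Lemma 12 (cf. [6, Lem. 3.4]). Let $\tilde{y} = \tilde{A}x + \tilde{n}$ be given, and suppose that Conditions $(C_1)$-$(C_5)$ hold
true, with $S = \operatorname{supp}(x)$. If

\[ \min_{\Theta \in S} |x_\Theta| > 8\mu, \]

then the solution to the LASSO (40) is given by $\hat{x} = x + h$, with

\[ h_S = (\tilde{A}_S^* \tilde{A}_S)^{-1} (\tilde{A}_S^* \tilde{n} - 2\mu \operatorname{sgn}(x_S)), \qquad h_{S^c} = 0. \tag{41} \]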
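The explicit formula (41) can be checked by hand in the simplest possible setting. The sketch below assumes a real-valued problem with the identity as measurement matrix, which is not the radar matrix of the paper; in that case $\tilde{A}_S^*\tilde{A}_S = \mathrm{Id}$, the LASSO (40) decouples into scalar problems, and its solution is soft thresholding with threshold $2\mu$. All function names and numbers are hypothetical and chosen for illustration.

```python
def soft_threshold(y, threshold):
    """Minimizer of 0.5*(x - y)**2 + threshold*abs(x) over real x."""
    if y > threshold:
        return y - threshold
    if y < -threshold:
        return y + threshold
    return 0.0


def lasso_identity_design(y, mu):
    """LASSO (40) for the identity matrix: entrywise soft thresholding at 2*mu."""
    return [soft_threshold(v, 2.0 * mu) for v in y]


def lemma12_prediction(x, n, mu):
    """x + h with h_S = n_S - 2*mu*sgn(x_S) and h = 0 off the support, cf. (41)."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [xi + ni - 2.0 * mu * sign(xi) if xi != 0 else 0.0
            for xi, ni in zip(x, n)]


mu = 0.5
x = [10.0, 0.0, -6.0, 0.0]      # min |x_k| on the support is 6 > 8*mu = 4
n = [0.3, -0.4, 0.2, 0.7]       # |n_k| <= 2*mu on S and <= sqrt(2)*mu off S
y = [xi + ni for xi, ni in zip(x, n)]

print(lasso_identity_design(y, mu))
print(lemma12_prediction(x, n, mu))
```

Both lines print the same vector (up to rounding), and its support coincides with that of `x`, as Lemma 12 predicts when the conditions hold.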

3.1.1 Estimates for the Conditions $(C_1)$-$(C_5)$

With Lemma 12 at hand we have a basic pattern for the proof of Theorem 7. Indeed, one easily verifies
that, assuming Conditions $(C_1)$-$(C_5)$ are fulfilled, Theorem 7 follows by rescaling the LASSO functional
(19) with the factor $1/(N_T N_R N_t)$ and recalling that $\tilde{\sigma} = \sigma/\sqrt{N_T N_R N_t}$. In order to show that Conditions
$(C_1)$-$(C_5)$ hold true with high probability we conduct a probabilistic analysis of the extremal singular
values of the matrix $\tilde{A}_S$. Since columns of the matrix $\tilde{A}_S$ belonging to different angle classes (see Section
1.4 for the definition) are not correlated, it suffices to consider submatrices corresponding to the subsets
$S_{[\beta]} \subset S$ (see (24)) representing equivalent indices. In this sense, the following proposition is the main
ingredient for the verification of Conditions $(C_1)$-$(C_5)$. The proof of Proposition 13 is rather long and,
therefore, postponed until Section 3.3.

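Proposition 13. Let $S \subset \mathcal{G}$ be a support set and let $[\beta]$ be an angle class such that $S_{[\beta]} \subset S$ is not
empty. Then, for all $\delta \in (0,1)$,

\[ \mathbb{P}\big( \|\tilde{A}_{S_{[\beta]}}^* \tilde{A}_{S_{[\beta]}} - \mathrm{Id}\|_{2\to 2} \ge \delta \big) \le 7 |S_{[\beta]}| \exp\Big( -\frac{\delta \sqrt{N_t}}{8 \sqrt{|S_{[\beta]}|}} \Big). \tag{42} \]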

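Before we turn to the verification of Conditions $(C_1)$-$(C_5)$, we derive two corollaries which follow
directly from Proposition 13. We have already pointed out in (23) that, due to the fact that columns of
$\tilde{A}$ belonging to different angle classes are not correlated, it holds

\[ \mathbb{P}\big( \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \ge \delta \big) = \mathbb{P}\big( \max_{[\beta]} \|\tilde{A}_{S_{[\beta]}}^* \tilde{A}_{S_{[\beta]}} - \mathrm{Id}\|_{2\to 2} \ge \delta \big). \]

Using the union bound and the tail bounds provided by Proposition 13 we arrive at

\[ \mathbb{P}\big( \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \ge \delta \big) \le 7 \sum_{[\beta]} |S_{[\beta]}| \exp\Big( -\frac{\delta \sqrt{N_t}}{8 \sqrt{|S_{[\beta]}|}} \Big), \]

where the sum is taken over all angle classes $[\beta]$ (with the property that $S_{[\beta]}$ is not empty). Now it
suffices to plug in the inequality $|S_{[\beta]}| \le \eta|S|/N_R$, which applies for $\eta$-balanced support sets, to obtain
the following corollary (recall that there are exactly $N_R$ angle classes $[\beta]$).

Corollary 14. For each $\eta$-balanced support set $S \subset \mathcal{G}$ it holds

\[ \mathbb{P}\big( \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \ge \delta \big) \le 7 \eta |S| \exp\Big( -\frac{\delta \sqrt{N_R N_t}}{8 \sqrt{\eta |S|}} \Big). \]

We use the estimate (42) to derive a bound on the coherence of $\tilde{A}$. Note that the coherence of this
matrix fulfills

\[ \max_{\Theta \neq \Theta'} |\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle| = \max_{[\beta]} \ \max_{\Theta, \Theta' \in \mathcal{G}_{[\beta]}, \, \Theta \neq \Theta'} |\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle|. \]

Let $\Theta, \Theta' \in \mathcal{G}_{[\beta]}$ be given. For any matrix $M$ and each entry $m_{ij}$ it holds $|m_{ij}| \le \|M\|_{2\to 2}$. Now, since
$\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle$ is an entry of the matrix $\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}$, where we set $S := \{\Theta, \Theta'\}$, we obtain

\[ \mathbb{P}\big( |\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle| \ge u \big) \le 14 \exp\Big( -\frac{u \sqrt{N_t}}{8\sqrt{2}} \Big) \]

from (42). Since we can choose at most $(N/N_R)^2/2$ pairs $\Theta, \Theta'$ within each angle class and there are $N_R$ angle classes, an
application of the union bound yields the following corollary.

Corollary 15. The coherence of the matrix $\tilde{A}$ satisfies

\[ \mathbb{P}\big( \max_{\Theta \neq \Theta'} |\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle| \ge u \big) \le \frac{7N^2}{N_R} \exp\Big( -\frac{u \sqrt{N_t}}{8\sqrt{2}} \Big). \]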


Next we will derive upper bounds for the respective probabilities that one of the Conditions $(C_1)$-$(C_5)$
fails to hold.


Condition $(C_1)$

We define events $\mathcal{C}_1^\delta$ as

\[ \mathcal{C}_1^\delta = \{ \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le \delta \le 1/2 \}. \]

Clearly, if $\mathcal{C}_1^\delta$ occurs (i.e., $\delta \in (0, 1/2]$), then it holds $\|(\tilde{A}_S^* \tilde{A}_S)^{-1}\|_{2\to 2} \le 1/(1 - \delta) \le 2$. Hence, Corollary
14 can be used to show that Condition $(C_1)$ is fulfilled with high probability.

Condition $(C_2)$

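Lemma 16. Let $S \subset \mathcal{G}$ be an $\eta$-balanced support set. If there exist $\delta \in (0,1)$ and $u > 0$ such that

\[ \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le \delta, \qquad \max_{\Theta \neq \Theta'} |\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle| \le u, \tag{43} \]

then it holds

\[ \mathbb{P}\big( \|\tilde{A}_{S^c}^* \tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \ge 1/4 \big) \le 2N \exp\Big( -\frac{(1-\delta)^2 N_R}{32 u^2 \eta |S|} \Big). \]

Proof. For $\Theta \in S^c$ set $v_\Theta := (\tilde{A}_S^* \tilde{A}_S)^{-1} \tilde{A}_S^* \tilde{A}_\Theta$. Since $\operatorname{sgn}(x_S)$ forms a Steinhaus sequence, the Hoeffding-type
inequality for Steinhaus sequences of Lemma 31 yields

\[ \mathbb{P}\big( |\langle v_\Theta, \operatorname{sgn}(x_S) \rangle| \ge 1/4 \big) \le 2\exp\Big( -\frac{1}{32 \|v_\Theta\|_2^2} \Big). \tag{44} \]

In order to derive a bound for the norms $\|v_\Theta\|_2$ we calculate

\[ \|v_\Theta\|_2 = \|(\tilde{A}_S^* \tilde{A}_S)^{-1} \tilde{A}_S^* \tilde{A}_\Theta\|_2 \le \|(\tilde{A}_S^* \tilde{A}_S)^{-1}\|_{2\to 2} \sqrt{ \sum_{\Theta' \in S} |\langle \tilde{A}_{\Theta'}, \tilde{A}_\Theta \rangle|^2 }. \]

By assumption, $\|(\tilde{A}_S^* \tilde{A}_S)^{-1}\|_{2\to 2}$ is bounded by $(1 - \delta)^{-1}$. Since $S$ is $\eta$-balanced, there are at most
$\eta|S|/N_R$ indices $\Theta' \in S$ such that the summands $|\langle \tilde{A}_{\Theta'}, \tilde{A}_\Theta \rangle|^2$ are not equal to zero (recall that columns corresponding
to different angle classes are not correlated). Together with the fact that each such inner product
$|\langle \tilde{A}_{\Theta'}, \tilde{A}_\Theta \rangle|$ is bounded by $u$, we have $\|v_\Theta\|_2 \le u \sqrt{\eta|S|} / ((1 - \delta)\sqrt{N_R})$. Inserting this estimate into (44)
and furthermore applying the union bound yields the assertion. □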
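The Hoeffding-type step in the proof can be illustrated by a small simulation. The sketch below assumes the inequality takes the standard form $\mathbb{P}(|\sum_k v_k \epsilon_k| \ge t) \le 2\exp(-t^2/(2\|v\|_2^2))$ for a Steinhaus sequence $\epsilon$, which matches the constants in (44); the exact statement of Lemma 31 is not reproduced here. The function names, the test vector and the number of trials are arbitrary illustrative choices.

```python
import cmath
import math
import random


def steinhaus_sequence(n, rng):
    """n independent random variables uniformly distributed on the complex unit circle."""
    return [cmath.exp(2j * math.pi * rng.random()) for _ in range(n)]


def empirical_tail(v, threshold, trials, rng):
    """Fraction of trials in which |sum_k v_k * eps_k| >= threshold."""
    hits = 0
    for _ in range(trials):
        eps = steinhaus_sequence(len(v), rng)
        if abs(sum(vk * ek for vk, ek in zip(v, eps))) >= threshold:
            hits += 1
    return hits / trials


def hoeffding_bound(v, threshold):
    """Assumed bound 2*exp(-t^2 / (2*||v||_2^2)), capped at 1."""
    norm_sq = sum(abs(vk) ** 2 for vk in v)
    return min(1.0, 2.0 * math.exp(-threshold ** 2 / (2.0 * norm_sq)))


rng = random.Random(0)
v = [0.025] * 16                      # ||v||_2^2 = 0.01
print(empirical_tail(v, 0.25, 5000, rng), hoeffding_bound(v, 0.25))
```

With the threshold $1/4$ used in Condition $(C_2)$, the empirical frequency stays well below the bound, which illustrates why a small norm $\|v_\Theta\|_2$ is what the proof needs.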


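Condition $(C_5)$

Lemma 17. If the matrix $\tilde{A}_S$ obeys $\|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le \delta < 1$, and $\operatorname{sgn}(x_S)$ is a (random) Steinhaus
vector, then it holds

\[ \mathbb{P}\big( \|(\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \ge 3 \big) \le 2|S| \exp\Big( -\frac{2(1-\delta)^2}{\delta^2} \Big). \]

Proof. An application of the triangle inequality yields

\[ \|(\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \le \|\operatorname{sgn}(x_S)\|_\infty + \|((\tilde{A}_S^* \tilde{A}_S)^{-1} - \mathrm{Id}) \operatorname{sgn}(x_S)\|_\infty. \]

Since the first summand on the right-hand side is equal to 1, it suffices to show that the second term
is bounded by 2 with high probability. Let $v_\Theta$, $\Theta \in S$, denote the rows of the matrix $(\tilde{A}_S^* \tilde{A}_S)^{-1} - \mathrm{Id}$.
Due to the assumption it holds $\|\mathrm{Id} - \tilde{A}_S^* \tilde{A}_S\|_{2\to 2} \le \delta$. Thus, using the Neumann series,

\[ \|v_\Theta\|_2 = \|((\tilde{A}_S^* \tilde{A}_S)^{-1} - \mathrm{Id}) e_\Theta\|_2 = \Big\| \sum_{k=1}^\infty (\mathrm{Id} - \tilde{A}_S^* \tilde{A}_S)^k e_\Theta \Big\|_2 \le \frac{1}{1-\delta} \|\mathrm{Id} - \tilde{A}_S^* \tilde{A}_S\|_{2\to 2} \le \frac{\delta}{1-\delta}. \]

Using Hoeffding's inequality for Steinhaus sums (Lemma 31) we obtain

\[ \mathbb{P}\big( |\langle v_\Theta, \operatorname{sgn}(x_S) \rangle| \ge 2 \big) \le 2\exp\Big( -\frac{2(1-\delta)^2}{\delta^2} \Big). \]

Finally, the assertion follows by applying the union bound. □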


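Condition $(C_3)$

Lemma 18. If the matrix $\tilde{A}_S$ obeys $\|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le 1/2$, then, for a mean-zero Gaussian vector $\tilde{n}$
with independent entries of variance $\tilde{\sigma}^2$ and $\mu = \tilde{\sigma}\sqrt{2\log(N)}$, it holds

\[ \mathbb{P}_{\tilde{n}}\big( \|(\tilde{A}_S^* \tilde{A}_S)^{-1} \tilde{A}_S^* \tilde{n}\|_\infty \ge 2\mu \big) \le N^{-3}. \]

Proof. For $\Theta \in S$ we write $v_\Theta$ to denote the row of the matrix $(\tilde{A}_S^* \tilde{A}_S)^{-1} \tilde{A}_S^*$ corresponding to the index
$\Theta$. An application of Lemma 29 gives, since $\tilde{\sigma}^{-1}\tilde{n}$ is a standard complex Gaussian random vector,

\[ \mathbb{P}_{\tilde{n}}\big( |\langle v_\Theta, \tilde{n} \rangle| \ge 2\mu \big) = \mathbb{P}_{\tilde{n}}\big( |\langle v_\Theta, \tilde{\sigma}^{-1}\tilde{n} \rangle| \ge 2\sqrt{2\log(N)} \big) \le \exp\Big( -\frac{8\log(N)}{\|v_\Theta\|_2^2} \Big). \]

Since $v_\Theta^*$, $\Theta \in S$, is a column of the matrix $M := \tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1}$, the operator norm of $M$ provides a
uniform bound for the $\ell_2$-norms of the rows $v_\Theta$. By assumption $\|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le 1/2$, and, hence,

\[ \|v_\Theta\|_2^2 \le \|M\|_{2\to 2}^2 = \|M^* M\|_{2\to 2} = \|(\tilde{A}_S^* \tilde{A}_S)^{-1}\|_{2\to 2} \le (1 - 1/2)^{-1} = 2. \]

Combining this with the tail bound above yields

\[ \mathbb{P}_{\tilde{n}}\big( |\langle v_\Theta, \tilde{n} \rangle| \ge 2\mu \big) \le e^{-8\log(N)/2} = N^{-4}. \]

An application of the union bound, using $|S| \le N$, implies the assertion. □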
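The arithmetic behind the exponents in this proof is easy to verify. The sketch below assumes that the tail bound of Lemma 29 has the form $\mathbb{P}(|\langle v, g \rangle| \ge t) \le \exp(-t^2/\|v\|_2^2)$ for a standard complex Gaussian vector $g$, which is the form consistent with the displayed computation; the function names and the value $N = 1000$ are hypothetical.

```python
import math


def complex_gaussian_tail(t, v_norm_sq):
    """Assumed Lemma 29-type bound exp(-t^2 / ||v||_2^2)."""
    return math.exp(-t * t / v_norm_sq)


def condition_c3_failure_bound(N, support_size, v_norm_sq=2.0):
    """Per-row tail bound at level 2*sqrt(2*log N) and its union bound over the support."""
    t = 2.0 * math.sqrt(2.0 * math.log(N))
    per_row = complex_gaussian_tail(t, v_norm_sq)
    return per_row, support_size * per_row


per_row, union = condition_c3_failure_bound(N=1000, support_size=1000)
print(per_row, union)   # approximately N**-4 and N**-3
```

The same computation with threshold $\sqrt{2}\cdot\sqrt{2\log N}$ and squared norm bound $4/3$ gives the per-row bound $N^{-3}$ that appears in the treatment of Condition $(C_4)$.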

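Condition $(C_4)$

In order to deal with Condition $(C_4)$, we need an estimate on the $\ell_2$-norms of the columns of the matrix $\tilde{A}$.

Lemma 19. The $\ell_2$-norms of the columns of the matrix $\tilde{A}$ satisfy

\[ \mathbb{P}\big( \max_{\Theta \in \mathcal{G}} \|\tilde{A}_\Theta\|_2 \ge 2/\sqrt{3} \big) \le N e^{-(2/\sqrt{3} - 1)^2 N_t}. \]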

Proof. It follows from the definition of the columns A in ( 8 ) that ||Ae|| 2 , 0 G t/, is given by --^|| 2;||2 
with 2 ; = where the Si are independent standard complex Gaussian vectors. 

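Thus, $z$ is also an $N_t$-dimensional standard complex Gaussian random vector, so that Lemma 30 yields

\[ \mathbb{P}\big( \|\tilde{A}_\Theta\|_2 \ge 2/\sqrt{3} \big) = \mathbb{P}\big( \|z\|_2 \ge \sqrt{N_t} + (2/\sqrt{3} - 1)\sqrt{N_t} \big) \le e^{-(2/\sqrt{3} - 1)^2 N_t}. \]

The assertion follows by applying the union bound. □

Under the assumption that the columns are bounded in the $\ell_2$-norm, we are able to provide an
estimate for the probability that Condition $(C_4)$ fails to hold.

Lemma 20. Let $\tilde{n}$ be a mean-zero Gaussian vector with covariance matrix $\tilde{\sigma}^2 \mathrm{Id}$ and $\mu = \tilde{\sigma}\sqrt{2\log(N)}$.
If $\|\tilde{A}_\Theta\|_2 \le 2/\sqrt{3}$ for all $\Theta \in S^c$, then

\[ \mathbb{P}_{\tilde{n}}\big( \|\tilde{A}_{S^c}^* (\mathrm{Id} - \Pi_S) \tilde{n}\|_\infty \ge \sqrt{2}\mu \big) \le N^{-2}. \]

Proof. Note that the $\ell_2$-norms of the rows of the matrix $\tilde{A}_{S^c}^* (\mathrm{Id} - \Pi_S)$ are uniformly bounded by

\[ \|(\mathrm{Id} - \Pi_S) \tilde{A}_\Theta\|_2 \le \|\mathrm{Id} - \Pi_S\|_{2\to 2} \max_{\Theta \in S^c} \|\tilde{A}_\Theta\|_2 \le 2/\sqrt{3}, \]

since $\|\mathrm{Id} - \Pi_S\|_{2\to 2} \le 1$. Using the same arguments as in the proof of Lemma 18 (now with $v_\Theta$, $\Theta \in S^c$, denoting the columns of
the matrix $(\mathrm{Id} - \Pi_S)\tilde{A}_{S^c}$), we obtain

\[ \mathbb{P}_{\tilde{n}}\big( |\langle v_\Theta, \tilde{n} \rangle| \ge \sqrt{2}\mu \big) \le e^{-4\log(N)/(4/3)} = N^{-3}, \]

where, in the last step, we used the bound on the norm $\|v_\Theta\|_2$ from above. An application of the union
bound, using that $|S^c| \le N$, implies the assertion. □

3.1.2 Proof of Theorem 7(a)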


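Now we are prepared to finalize the proof of Theorem 7. In view of Lemma 12, we only have to verify
that Conditions $(C_1)$-$(C_5)$ are fulfilled with probability exceeding $1 - 7\max\{\varepsilon, N^{-2}\}$ in order to show
part (a) of Theorem 7. As we have already noted, Condition $(C_1)$ holds if $\|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le \delta$ for
some $\delta \in (0, 1/2]$. Thus, Corollary 14 can be used to show that $(C_1)$ is fulfilled with high probability.
Moreover, a careful inspection of Corollaries 14, 15 and Lemmas 16-20 reveals that the probability of the
assertion of Theorem 7 failing to hold true is bounded by the probability $\mathbb{P}(\neg\mathcal{A}_{S,u}) + \sum_{k=2}^{5} \mathbb{P}(\neg\mathcal{C}_k \mid \mathcal{A}_{S,u})$,
where each $\mathcal{C}_k$ stands for the event that Condition $(C_k)$ is fulfilled, and the event $\mathcal{A}_{S,u}$ is defined as

\[ \mathcal{A}_{S,u} = \underbrace{\{ \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le \delta \}}_{= \mathcal{C}_1^\delta} \cap \underbrace{\{ \max_{\Theta \neq \Theta'} |\langle \tilde{A}_\Theta, \tilde{A}_{\Theta'} \rangle| \le u \}}_{=: \mathcal{E}_1} \cap \underbrace{\{ \max_{\Theta} \|\tilde{A}_\Theta\|_2 \le 2/\sqrt{3} \}}_{=: \mathcal{E}_2}. \tag{45} \]

Now, Corollaries 14, 15 and Lemma 19 imply

\[ \mathbb{P}(\neg\mathcal{A}_{S,u}) \le \mathbb{P}(\neg\mathcal{C}_1^\delta) + \mathbb{P}(\neg\mathcal{E}_1) + \mathbb{P}(\neg\mathcal{E}_2)
\le 7\eta|S| \exp\Big( -\frac{\delta \sqrt{N_R N_t}}{8\sqrt{\eta|S|}} \Big) + \frac{7N^2}{N_R} \exp\Big( -\frac{u\sqrt{N_t}}{8\sqrt{2}} \Big) + N e^{-(2/\sqrt{3} - 1)^2 N_t}. \tag{46} \]

Due to Lemmas 16, 17, 18, and 20 we have the following bounds on the failure probability of Conditions
$(C_2)$, $(C_5)$, $(C_3)$, and $(C_4)$, conditional on the event $\mathcal{A}_{S,u}$:

\[ \mathbb{P}(\neg\mathcal{C}_2 \mid \mathcal{A}_{S,u}) \le 2N \exp\Big( -\frac{(1-\delta)^2 N_R}{32 u^2 \eta|S|} \Big), \qquad \mathbb{P}(\neg\mathcal{C}_5 \mid \mathcal{A}_{S,u}) \le 2|S| \exp\Big( -\frac{2(1-\delta)^2}{\delta^2} \Big), \tag{47} \]

\[ \mathbb{P}(\neg\mathcal{C}_3 \mid \mathcal{A}_{S,u}) \le N^{-3}, \qquad \mathbb{P}(\neg\mathcal{C}_4 \mid \mathcal{A}_{S,u}) \le N^{-2}. \tag{48} \]

Since we claim that the assertion of Theorem 7 holds true with probability exceeding $1 - 7\max\{\varepsilon, N^{-2}\}$,
it is enough to verify that each of the seven terms in (46), (47), (48) is bounded either by $\varepsilon$ or by $N^{-2}$.
Since this holds trivially for the terms in (48), we are left with the terms in (46) and (47) which, as we
will see, can all be bounded by $\varepsilon$. Due to the definition of the parameter $\eta$ (cf. Definition 6) it holds
(note that each equivalence class contains $N/N_R$ elements)

\[ \eta = \max_{[\beta]} \frac{|S_{[\beta]}|}{|S|/N_R} \le \frac{N/N_R}{|S|/N_R} = \frac{N}{|S|}. \]

Thus, $\eta|S| \le N$, which yields that in order to verify that the first term in (46) is smaller than or equal to $\varepsilon$ it
is sufficient to show that

\[ 64\, \delta^{-2} \eta|S| \log^2\Big( \frac{7N}{\varepsilon} \Big) \le N_R N_t, \tag{49} \]

whereas for the second term in (46) to be smaller than or equal to $\varepsilon$ it suffices that

\[ 128\, u^{-2} N_R \log^2\Big( \frac{7N^2}{\varepsilon N_R} \Big) \le N_R N_t. \tag{50} \]

The term $\log(7N^2/(\varepsilon N_R))$ is dominated by $2\log(7N/\varepsilon)$ and, therefore, we can set

\[ u := \delta \sqrt{8 N_R} / \sqrt{\eta|S|} \tag{51} \]

in order to ensure that (50) is already implied by (49). In order to show (49), we set

\[ \delta := \frac{1}{\sqrt{K \log(eN/\varepsilon)}}, \]

where $K > 0$ will be chosen below. With this the left-hand side of (49) can be estimated as

\[ 64\, \delta^{-2} \eta|S| \log^2\Big( \frac{7N}{\varepsilon} \Big) = 64 K \eta|S| \log\Big( \frac{eN}{\varepsilon} \Big) \log^2\Big( \frac{7N}{\varepsilon} \Big) \le 64 K \log^2(7)\, \eta|S| \log^3\Big( \frac{eN}{\varepsilon} \Big). \]

Due to the assumption (25) we have

\[ N_R N_t \ge C \eta|S| \log^3\Big( \frac{eN}{\varepsilon} \Big), \]

where $C > 0$ is a suitable constant. By setting $C = 64 K \log^2(7)$, we establish (49) (and, hence, also (50)),
which means that both the first and the second term in (46) are bounded by $\varepsilon$. In order to ensure that
the last term in (46) is bounded by the second one (and, hence, is also smaller than or equal
to $\varepsilon$), it suffices to have $(2/\sqrt{3} - 1)^2 N_t \ge u\sqrt{N_t}/(8\sqrt{2})$ which, recalling the definition of $u$, is equivalent to

\[ \delta \le 4 (2/\sqrt{3} - 1)^2 \sqrt{ N_t \, \eta|S| / N_R }. \]

Since by definition of the parameter $\eta$ we have $\eta|S|/N_R \ge \max_{[\beta]} |S_{[\beta]}| \ge 1$ and, moreover, $N_t \ge 1$ anyway,
this latter condition is fulfilled as long as $\delta \le 4(2/\sqrt{3} - 1)^2$, which is fulfilled whenever $K$ is sufficiently
large. Since $\log(eN/\varepsilon) \ge 1$, and, hence, $\delta \le 1/\sqrt{K}$, we have for the first term in (47), recalling that
$u^2 = 8\delta^2 N_R / (\eta|S|)$,

\[ 2N \exp\Big( -\frac{(1-\delta)^2 N_R}{32 u^2 \eta|S|} \Big) = 2N \exp\Big( -\frac{(1-\delta)^2}{256\, \delta^2} \Big) = 2N \exp\Big( -\frac{1}{256} \Big( \sqrt{K \log(eN/\varepsilon)} - 1 \Big)^2 \Big) \le 2N \exp\Big( -\frac{(\sqrt{K} - 1)^2}{256} \log\Big( \frac{eN}{\varepsilon} \Big) \Big). \]

Therefore, in order to show that the first term in (47) is bounded by $\varepsilon$, it is sufficient to ensure that
$(\sqrt{K} - 1)^2 \ge 256 = 16^2$, which holds true if $\sqrt{K} \ge 17$. The second term in (47) is bounded by the first
one. Finally, the condition $\delta \le 1/2$ is satisfied since $\delta \le 1/\sqrt{K} \le 1/17$. This finishes the proof of Theorem
7(a). □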
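The balancedness parameter $\eta$ and the bound $\eta|S| \le N$ used in the proof above can be illustrated with a short computation. The sketch below assumes that two grid angles $\beta, \beta'$ belong to the same angle class exactly when $\beta - \beta'$ is a multiple of $N_R$; the definition in Section 1.4 is not reproduced here, so this convention, the function name `balancedness` and the toy support sets are assumptions made for illustration.

```python
from collections import Counter


def balancedness(support, N_R):
    """eta = max over angle classes of |S_[beta]| divided by |S| / N_R.

    support : collection of distinct index triples (beta, tau, f)
    N_R     : number of receivers, assumed equal to the number of angle classes
    """
    support = list(support)
    if not support:
        raise ValueError("support set must not be empty")
    class_sizes = Counter(beta % N_R for (beta, tau, f) in support)
    return max(class_sizes.values()) / (len(support) / N_R)


N_R = 4
spread = [(0, 1, 0), (1, 2, 0), (2, 0, 1), (3, 5, 2)]      # one index per class
clustered = [(0, 1, 0), (4, 2, 0), (8, 0, 1), (1, 5, 2)]   # three indices in class 0

print(balancedness(spread, N_R))     # 1.0, perfectly balanced
print(balancedness(clustered, N_R))  # 3.0
```

A perfectly spread support gives $\eta = 1$, while a support contained in a single angle class gives the worst case $\eta = N_R$.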


3.1.3 Proof of Theorem 7(b) 

For part (b) it remains to derive the stated approximation property of the solution $z^\#$ to the debiased
LASSO with respect to $x_S$. From the proof of part (a) it follows that, with high probability (as stated
in the theorem), the columns of the matrix $A_S$ are linearly independent. Hence, the solution to the
least squares problem (20) is given by $z^\# = (A_S^* A_S)^{-1} A_S^* y$. Indeed, this formula can easily be verified
by calculating the corresponding optimality conditions. Since $x$ is exactly sparse and supported on the
set $S$ it holds $y = A_S x_S + n$, i.e.,

\[ z^\# = x_S + (A_S^* A_S)^{-1} A_S^* n. \]

Now, by a standard inequality between the $\ell_2$- and the $\ell_\infty$-norm,

\[ \|z^\# - x_S\|_2 = \|(A_S^* A_S)^{-1} A_S^* n\|_2 \le \sqrt{s} \, \|(A_S^* A_S)^{-1} A_S^* n\|_\infty. \tag{52} \]

Since we may assume that we are in the event where Condition $(C_3)$ holds true, we have the estimate

\[ \|(A_S^* A_S)^{-1} A_S^* n\|_\infty = \|(\tilde{A}_S^* \tilde{A}_S)^{-1} \tilde{A}_S^* \tilde{n}\|_\infty \le 2\mu = 2\tilde{\sigma}\sqrt{2\log(N)} = \frac{2\sigma}{\sqrt{N_T N_R N_t}} \sqrt{2\log(N)}, \]

where, in the second-to-last step, we plugged in the definition of $\mu$ from (40). A combination of this
latter inequality with (52) implies the desired approximation property. □

3.2 Recovery via basis pursuit denoising under sparsity defect

We replace the original constraint $\|Ax - y\|_2 \le \varrho$ of the basis pursuit denoising program (11) by the
equivalent constraint $\|\tilde{A}x - \tilde{y}\|_2 \le \tilde{\varrho}$, where $\tilde{\varrho} = \varrho/\sqrt{N_T N_R N_t}$. For the proof of Theorem 8 we use a
general recovery result for basis pursuit denoising stated as Theorem 21 below. The original result [14,
Thm. 4.33] uses an inexact dual certificate, which in our case can actually be taken as the canonical exact
dual certificate given by

\[ h = \tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S). \]

Hence, the following Theorem 21, as it is formulated here, is a special case of [14, Thm. 4.33], see also [19].
The constants $C_1$ and $C_2$ appearing in (54) below can be calculated explicitly using the corresponding
formulas in [14, Thm. 4.33]. However, we refrain from doing so at this point.

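Theorem 21 (cf. [14, Thm. 4.33]). Let $x \in \mathbb{C}^N$ be given with $s$ largest absolute entries supported on
$S \subset [N]$. Assume that measurements $y = Ax + n \in \mathbb{C}^m$ are given with $m \le N$ and $\|n\|_2 \le \varrho$. If

\[ \|A_S^* A_S - \mathrm{Id}\|_{2\to 2} \le \delta < 1, \qquad \|A_{S^c}^* A_S (A_S^* A_S)^{-1} \operatorname{sgn}(x_S)\|_\infty < 1, \]

and, moreover, for $\alpha_1, \alpha_2 > 0$,

\[ \max_{\Theta \in S^c} \|A_S^* A_\Theta\|_2 \le \alpha_1, \qquad \|A_S (A_S^* A_S)^{-1} \operatorname{sgn}(x_S)\|_2 \le \alpha_2 \sqrt{s}, \tag{53} \]

then each solution $x^\#$ to the basis pursuit denoising program

\[ \min_z \|z\|_1 \quad \text{subject to} \quad \|Az - y\|_2 \le \varrho \]

satisfies

\[ \|x^\# - x\|_2 \le C_1 \inf_{s\text{-sparse } x'} \|x' - x\|_1 + C_2 \sqrt{s}\, \varrho, \tag{54} \]

for some constants $C_1$ and $C_2$ depending only on $\delta, \alpha_1, \alpha_2$.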

Remark 22. The error estimate (54) is slightly weaker than the one that holds under the RIP, see (12). In
fact, (54) features the $\ell_2$-norm on the left-hand side, whereas (12) states an error estimate in the $\ell_1$-norm.
This seems to be the price one has to pay when not working with strong recovery conditions such as the RIP.


3.2.1 Proof of Theorem 8 

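In order to prove Theorem 8 we show that the assumptions of Theorem 21 are fulfilled with high
probability. This can be achieved by simply repeating some of the arguments we already used during
the proof of Theorem 7. Indeed, recalling the event $\mathcal{A}_{S,u}$ (see (45)) from the proof of Theorem 7, where
we now choose $\delta = 1/2$ and $u$ as in (51), it holds (for details see the proof of Theorem 7)

\[ \mathcal{A}_{S,u} \subset \{ \|\tilde{A}_S^* \tilde{A}_S - \mathrm{Id}\|_{2\to 2} \le 1/2 \} \cap \{ \max_{\Theta \in \mathcal{G}} \|\tilde{A}_\Theta\|_2 \le 2/\sqrt{3} \}. \]

This means that, in case of the event $\mathcal{A}_{S,u}$, it holds

\[ \max_{\Theta \in S^c} \|\tilde{A}_S^* \tilde{A}_\Theta\|_2 \le \|\tilde{A}_S\|_{2\to 2} \max_{\Theta} \|\tilde{A}_\Theta\|_2 \le 2\sqrt{1 + 1/2}/\sqrt{3} = \sqrt{2}, \]

and, moreover,

\[ \|\tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_2 \le \|\tilde{A}_S\|_{2\to 2} \|(\tilde{A}_S^* \tilde{A}_S)^{-1}\|_{2\to 2} \|\operatorname{sgn}(x_S)\|_2 \le \sqrt{3/2} \cdot 2 \cdot \sqrt{s} = \sqrt{6}\sqrt{s}. \]

As a consequence, by setting $\alpha_1 = \sqrt{2}$ and $\alpha_2 = \sqrt{6}$, the assumptions in (53) are automatically fulfilled,
provided that the event $\mathcal{A}_{S,u}$ occurs. Finally, as the proof of Theorem 7 shows, it holds $\mathbb{P}(\neg\mathcal{A}_{S,u}) \le 3\varepsilon$
and (by Lemma 16)

\[ \mathbb{P}\big( \|\tilde{A}_{S^c}^* \tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \ge 1/4 \mid \mathcal{A}_{S,u} \big) \le \varepsilon. \]

This is also a consequence of the special choice of $u$, depending on $\delta$, and the condition (28) on the
number of targets $s$ from the assumptions of Theorem 8. Thus, using the union bound, the probability
that the assumptions of Theorem 21 fail to hold is bounded by

\[ \mathbb{P}(\neg\mathcal{A}_{S,u}) + \mathbb{P}\big( \|\tilde{A}_{S^c}^* \tilde{A}_S (\tilde{A}_S^* \tilde{A}_S)^{-1} \operatorname{sgn}(x_S)\|_\infty \ge 1/4 \mid \mathcal{A}_{S,u} \big) \le 3\varepsilon + \varepsilon = 4\varepsilon. \qquad □ \]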


3.3 Proof of Proposition 13 

We now show Proposition 13 concerning the well-conditionedness of the submatrix $\tilde{A}_{S_{[\beta]}}$. In what
follows we write $s_{(i,a)}$ for the $a$-th entry of the signal vector $\mathbf{s}_i$. In order to analyze the quantity
$\|\tilde{A}_{S_{[\beta]}}^* \tilde{A}_{S_{[\beta]}} - \mathrm{Id}\|_{2\to 2}$ we write the matrix product as (see Appendix C, (76) for a proof)

Nt Nt 

(55) 

a,h—l 


where is an |S'[/ 3 ]| x |S'[/ 3 ]| matrix which, for indices 0,0' S 0 = (/3,r,/), O' = 

is given entrywise by 




(56) 


and where is defined in (29). By subdividing the sum in (55) into an off-diagonal part containing the 
summands with {i,a) {j,b) and a remaining diagonal part, and then applying the triangle inequality, 

it follows that 




< 

r Nt Nt 1 

-Id 

+ 

Nt Nt 


*“ i—1 a—1 ^ 


2-)-2 

2 ,jf=l a,b—l 


(i,a)/(j,b) 


= : ||>^=-Id||2^2 + ||y^||2^2. (57) 


A tail bound for ||y= — Id|| 2->2 

By plugging in the definition of the entries from (56) one finds 


Nt 


Nt 


— 1 2277 ' 


]0,0' = Sr'.r{NTNt) ^6 


f'-f. 


i—'i a—1 


Nt 

E- 

a=l 


f'-f 


Nt 


(“-!) g*27r.dTA^(/3'-/3)(i-l) _ 


i=l 


i.e., the sum over all matrices yb’“)>b’“) is the identity matrix. Therefore, we can write the matrix 
- Id in (57) as a sum over random matrices, namely 


Nt Nt 


= X(..a) = |S(*.a)P-l. 




(58) 


{i,a) 

The random matrices Y have block diagonal structure. To see this, we partition the index set 5[^] 
as 


^m= U /')€%]: T = r}. 

TG[Aft] 

Recall that the matrix y(*’“)4®’“) is given entrywise by 
The appearance of the factor $\delta_{\tau,\tau'}$ implies that the matrices $Y^{(i,a),(i,a)}$ (and, hence, also $Y^{=}$) have a
block diagonal structure. In case of the matrices $Y^{(i,a),(i,a)}$ one obtains, enumerating the indices $(\beta, \tau, f) \in S_{[\beta]}$
adequately,


{i,a) 


Ai' 


(i,a) 




, (*.“) 
^Nt 


Ab.a) 


(59) 


where the vectors Drare given entrywise as 


(/3,r,/)G5f^]. 


Note, since for particular choices of r G [Nt] the index sets might be empty, it is possible that some 

blocks appearing in (59) might be zero-dimensional. In order to analyze the random variables 

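in (58), we use the following result, taken from [30]. Below we write $\lambda_{\max}(\cdot)$ to denote the maximal
eigenvalue of a given (self-adjoint) matrix.

Theorem 23 (cf. [30, Thm. 6.2]). Consider a finite sequence $\{X_k\}$ of independent, random, self-adjoint
$d \times d$ matrices. Assume that

\[ \mathbb{E} X_k = 0 \quad \text{and} \quad \mathbb{E} X_k^p \preceq \frac{p!}{2} R^{p-2} A_k^2 \quad \text{for } p = 2, 3, 4, \ldots \tag{60} \]

Compute the variance parameter

\[ \sigma^2 := \Big\| \sum_k A_k^2 \Big\|_{2\to 2}. \tag{61} \]

Then the following chain of inequalities holds for all $t \ge 0$:

\[ \mathbb{P}\Big( \lambda_{\max}\Big( \sum_k X_k \Big) \ge t \Big) \le d \exp\Big( \frac{-t^2/2}{\sigma^2 + Rt} \Big) \le \begin{cases} d \exp(-t^2/4\sigma^2), & \text{for } t \le \sigma^2/R, \\ d \exp(-t/4R), & \text{for } t \ge \sigma^2/R. \end{cases} \tag{62} \]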

In our case the matrices Xk are given by the matrices Y ’ being defined in (59). Clearly these 

(z,a) 

matrices are self-adjoint and, moreover ET = 0. In order to provide for all further inequalities in 
(60) we first calculate powers of the block matrices in (59): 


[A0.a)]P ^ ^ 


I 5, 


P-2 


m 


NrNt 


[a0.“)] 


(63) 


The random variables $X_{(i,a)}$ in (59) are subexponential with

\[ \mathbb{P}(|X_{(i,a)}| \ge t) = \mathbb{P}\big( \big| |s_{(i,a)}|^2 - 1 \big| \ge t \big) \le \mathbb{P}\big( |s_{(i,a)}|^2 \ge t - 1 \big) \le e^{-(t-1)} = e \, e^{-t}, \]

where the second inequality follows from Lemma 28. Using basically Fubini's theorem and a change of
variables (see, e.g., [14, Prop. 7.1]) and, furthermore, plugging in the tail estimate from above, we obtain
for $p \ge 2$,

\[ \mathbb{E}|X_{(i,a)}|^p = p \int_0^\infty t^{p-1} \mathbb{P}(|X_{(i,a)}| \ge t) \, dt \le e\, p \int_0^\infty t^{p-1} e^{-t} \, dt = 2e \, \frac{p!}{2}. \tag{64} \]

For the last equality we used that the appearing integral is equal to $\Gamma(p) = (p-1)!$, where $\Gamma$ denotes the
well-known Gamma function. Finally, by combining (63) with (64) and recalling (59) we observe that




P-2 


N'Y'Nt 


[v^y ^ 
With the matrices y^y in place of the matrices Ak, the quantity (61) can be calculated as


U2 = 


Nt Nt 




i—l a—1 

where we used that 


= 2e 


2->-2 


Nt Nt 


'y y y(i,a),(i,o)y-(i,o),(i,a 


i—l a—1 


= 2e 


maxT-g[7Vt] |5'j 


[/3]l 


2->-2 


NrNt 


r Nt Nt 

y y Y{i,a) 

*- 2=1 a—1 


0 , 0 ' 


50,0' 


NrNt' 


(The proof is analogous to that of the identities in (81), see the appendix.) A direct application of Theorem 23,
where the first bound in (62) applies, yields


P(||>^= 


Idll 


> 5/2) < |%]|exp - 


{5/2)^NTNt 


4- 2emax^g[y,^] \SL 


[/3]l 


< l^i/jjlexp 


5'^NTNt \ 


(65) 


A tail bound for ||3^^||2->2 

In order to derive a tail estimate for the quantity || 2 -!- 2 , we examine the moments ]E||3^^||2™2! rn gN. 
We use decoupling (see, e.g., [14, Thm. 8.11]) in order to obtain 




2m 17? 

0^0 = E. 


Nt Nt 

E E 

i,j — l a,b—l 






2^2 


Nt Nt 

E E 

i,j—l a,h—l 


A- .ylhaldjA) 


, (66) 


2->-2 


where the ^(i,a)j {,)> G [A^t] x [At], are independent standard complex Gaussian random 

variables. The term on the right-hand side can be estimated by means of the following Khintchine-type inequality for a homogeneous matrix-valued chaos of order two taken from [25]^. Although the original result [25, Thm. 6.22] is formulated for independent Rademacher random variables, a careful inspection of the proof reveals that it also holds for standard complex Gaussians.

Lemma 24. Let $B_{k,\ell}$, $k,\ell = 1,\dots,M$, be complex matrices of the same dimension and let $\xi_k, \zeta_\ell$, $k,\ell \in [M]$, be independent standard Gaussian random variables. Then, for $m \in \mathbb{N}$,


E 


M 

E ^kfeBk,e 

k,l=l 


2 m 

S2r, 



where $F, \tilde F$ are the block matrices $F = (B_{k,\ell})_{k,\ell=1}^{M}$ and $\tilde F = (B_{k,\ell}^*)_{k,\ell=1}^{M}$, respectively.

In our setting, the matrices $B_{k,\ell}$ are given by the matrices $Y^{(i,a),(j,b)}$. Therefore, $F$ and $\tilde F$ appearing in the assertion of Lemma 24 are block matrices consisting of $|S_{[\beta]}| \times |S_{[\beta]}|$ blocks. In the following we calculate the Schatten $2m$-norms of the matrix $F$. To this end we compute the eigenvalues of $F^*F$. Like the matrix $F$ itself, the product $F^*F$ also consists of $|S_{[\beta]}|$-dimensional blocks. A straightforward calculation using the definition of the matrices $Y^{(i,a),(j,b)}$ reveals (see the end of Appendix C for the details) that the block at position $(i,a),(j,b)$ of this matrix product is entrywise given by


r 171* iT'i AT~'^ 
-^10,0’ — °T’+b,T+a^''T 




^ Note that, due to the comment after the formulation of the original result in [25], the extra factor appearing on the right-hand side of the assertion in [25, Thm. 6.22] can be removed. In the original version of [25, Thm. 6.22], the quantity
||A|||A was missing; this has been corrected in a new version which can be obtained from the personal website of HR. 
Due to the appearance of the factor $\delta_{\tau'+b,\tau+a}$, the matrix $F^*F$ can be transformed into a block diagonal matrix by merely rearranging the ordering of the rows/columns. Hence, we assume without loss of generality that $F^*F$ is block diagonal, with the blocks $A_1, A_2, \dots, A_{N_t}$ on the diagonal, where each block $A_\kappa$ is a rank one matrix of the form


A. = 


l^ll . 

-“j 

NtN^ 


and where for each n € [At] the vector is entry wise given by 

K]*.e = ^ 0 

Hence, HifKlli = -^t|-S'[; 3 ]| and, therefore, has exactly one nonzero eigenvalue, equal to 

This implies that also the block A^ has exactly one nonzero eigenvalue which is equal to and, 

thus, ||A„|| 5 ^ = |5'[^]p/A(^. With this we can calculate. 




Ke[Wt] 


= Nt 


\Smr 

N‘2m 


= I‘ 5 '[/ 3 ]I 


j\^27n— 1 


— I [0]' Arm 


( 68 ) 


where we used that, without loss of generality, $|S_{[\beta]}| \le N_t$. By using the same arguments for the matrix $\tilde F$ it is straightforward to verify that also

II^IIIT™<I%]I^|P (69) 

holds true. 

We proceed with estimating the first two quantities in the maximum expression in (67). As we 
calculate in Appendix C (see (81)), 


Nt Nt 


Nt Nt 


E E 


),U,b) _ 


l^ll 


i,j — l a,b—l a,6=1 

Therefore, the corresponding Schatten 2m-norm in (67) can be calculated as 


Nt 


Id. 


Nt Nt -|l/2 2m 

E E 

*- ij — l a,6—1 S2m 


Nt Nt -|l/2 2m 

E E iv'"> 

^i,j — la,b—l S2m 


NF 


Now a direct application of the Khintchine-type inequality (see Lemma 24 above) to the moments in 
(66), using (68), (69), and the latter equalities implies 




(2m)! 

2""m! 


I%]l 


Njn 


™'lwsrj 


(70) 


Using Stirling's formula, one can easily verify that

$$\frac{(2m)!}{2^m\, m!} \le \sqrt{2}\,(2/e)^m\, m^m.$$
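The Stirling-type estimate $(2m)!/(2^m m!) \le \sqrt{2}\,(2/e)^m m^m$ can be checked numerically for small $m$. The following is only a sanity-check sketch, not part of the proof; the function names are illustrative and the convention $0^0 = 1$ is assumed for $m = 0$.

```python
import math

def double_factorial_ratio(m: int) -> int:
    """Return (2m)! / (2^m m!) exactly; this equals (2m-1)!! and is an integer."""
    return math.factorial(2 * m) // (2 ** m * math.factorial(m))

def stirling_bound(m: int) -> float:
    """Return sqrt(2) * (2/e)^m * m^m, using the convention 0^0 = 1."""
    return math.sqrt(2.0) * (2.0 / math.e) ** m * float(m) ** m

def bound_holds(m: int) -> bool:
    """Check the inequality (2m)!/(2^m m!) <= sqrt(2) (2/e)^m m^m for one m."""
    return double_factorial_ratio(m) <= stirling_bound(m)

# Check the inequality for m = 0, ..., 59 (floating point stays in range here).
checked = [bound_holds(m) for m in range(60)]
```

The ratio of the two sides tends to 1 as $m$ grows, so the constant $\sqrt{2}$ cannot be improved by much.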

Therefore, the moments in (70) can further be estimated as 




2 m 


(71) 


which holds true for $m \in \mathbb{N} \cup \{0\}$. Hölder's inequality implies

$$\mathbb{E}|Z|^{2m+2\theta} = \mathbb{E}\big[|Z|^{(1-\theta)2m}\,|Z|^{\theta(2m+2)}\big] \le \big(\mathbb{E}|Z|^{2m}\big)^{1-\theta}\big(\mathbb{E}|Z|^{2m+2}\big)^{\theta}$$

for any random variable $Z$, and each $\theta \in [0,1]$. Combining this with (71) gives


2 m -\-26 

For $m \in \mathbb{N}$ we estimate the last terms as

1 \ 2e(i-e) 

-1 < V2im + , (72) 

where we used the inequality between the (weighted) arithmetic and geometric mean and, furthermore, used the fact that $(m+1)/m \le 2$ and $\theta(1-\theta) \le 1/4$. Since

: X > 0} = > 3/5, 

we can replace $\sqrt{2}$ on the right-hand side in (72) by 3, leaving us with an inequality which is now also valid for $m = 0$. Altogether it holds for all real $p = 2m + 2\theta > 0$,

(6|%]|)'V 

In order to obtain a tail estimate for the random variable $\|\mathcal{Y}^{\neq}\|_{2\to 2}$ it only remains to apply the following lemma, which is a direct implication of Markov's inequality. For a proof see, e.g., [25, 14].

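Lemma 25. Let $Z$ be a random variable and suppose there exist constants $\alpha, \beta, \gamma, p_0 > 0$ such that

$$(\mathbb{E}|Z|^p)^{1/p} \le \alpha\,\beta^{1/p}\,p^{1/\gamma}$$

for all $p \ge p_0$. Then it holds for all $u \ge p_0^{1/\gamma}$,

$$\mathbb{P}(|Z| \ge e^{1/\gamma}\alpha u) \le \beta\,e^{-u^\gamma/\gamma}.$$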
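To illustrate how a moment bound turns into a tail bound, the sketch below evaluates the standard form of this lemma, $\mathbb{P}(|Z| \ge e^{1/\gamma}\alpha u) \le \beta e^{-u^\gamma/\gamma}$ for $u \ge p_0^{1/\gamma}$, on a toy example. The choice of example is an assumption made here, not taken from the proof: $Z$ is a standard complex Gaussian, for which $\mathbb{E}|Z|^p = \Gamma(p/2+1)$ and $\mathbb{P}(|Z| \ge t) = e^{-t^2}$, so the constants $\alpha = 1/\sqrt{2}$, $\beta = 1$, $\gamma = 2$, $p_0 = 2$ are admissible. All function names are illustrative.

```python
import math

def tail_from_moments(alpha, beta, gamma, u):
    """Return (threshold, bound) with P(|Z| >= threshold) <= bound,
    where threshold = e^{1/gamma} * alpha * u and bound = beta * exp(-u^gamma / gamma)."""
    threshold = math.exp(1.0 / gamma) * alpha * u
    bound = beta * math.exp(-(u ** gamma) / gamma)
    return threshold, bound

def complex_gaussian_moment_root(p):
    """(E|Z|^p)^{1/p} for a standard complex Gaussian Z, via E|Z|^p = Gamma(p/2 + 1)."""
    return math.exp(math.lgamma(p / 2.0 + 1.0) / p)

def exact_complex_gaussian_tail(t):
    """P(|Z| >= t) = exp(-t^2), since |Z|^2 is exponentially distributed with mean 1."""
    return math.exp(-t * t)

# Toy constants (assumed for this illustration only).
ALPHA, BETA, GAMMA, P0 = 1.0 / math.sqrt(2.0), 1.0, 2.0, 2.0

# Moment hypothesis: (E|Z|^p)^{1/p} <= alpha * beta^{1/p} * p^{1/gamma} for p >= p0.
moment_hypothesis_ok = all(
    complex_gaussian_moment_root(p) <= ALPHA * BETA ** (1.0 / p) * p ** (1.0 / GAMMA) + 1e-12
    for p in range(2, 41)
)

# Conclusion: compare the exact tail with the bound for a few admissible u.
comparisons = []
for u in (math.sqrt(P0), 2.0, 3.0):
    threshold, bound = tail_from_moments(ALPHA, BETA, GAMMA, u)
    comparisons.append((exact_complex_gaussian_tail(threshold), bound))
```

In this example the exact tail is far below the bound, which is expected: the lemma only uses moment growth and Markov's inequality.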

By applying Lemma 25 we obtain 

P(||3^^||2^2 > S/2) < 6|5[^]| exp ■ (73) 


^2m(l-e)(^ ^ l)(2m+2)e ^ + 


m 






Conclusion 

Finally, due to the definition of the matrices $\mathcal{Y}^{\neq}$, $\mathcal{Y}^{=}$ in (57), an application of the union bound and plugging in the tail bounds (65), (73) (noting that (65) is dominated by (73)) implies the assertion of Proposition 13. □


4 Numerical experiments 

We conduct numerical experiments in order to test the empirical probabilities of perfect reconstruction of support sets using the LASSO functional, as considered in Theorem 7. Due to runtime limitations we only perform simulations for the Doppler-free scenario. As pointed out in Section 1.6, this case is contained in our analysis as a special case. This means in particular that the same influence of the parameter $\eta$ on the recovery properties can be expected. This is indeed supported by the results of the numerical experiments described next.

We consider the case where $N_T = N_R = 8$ and $N_t = 64$, yielding a problem dimension of $N = N_T N_R N_t = 4096$ and a total number of $m = N_R N_t = 512$ measurements. We run simulations with varying choices for $(|S|, \eta)$, where $|S|$ is the number of targets and $\eta$ defines the actual balancedness of the considered support sets (see Definition 6). For each such pair $(|S|, \eta)$ we try to reconstruct 1000 independent random target scenes $x$ from measurements $y = Ax + n$ by minimizing the LASSO functional (19) with $\lambda = 2\sigma\sqrt{2 N_T N_R N_t \log(N)}$, and where, in each run,
Figure 2: Probability of perfect reconstruction of support sets for varying sparsity levels $|S|$ (horizontal axis: sparsity). Blue plots: Randomly chosen support sets having constant balancedness parameter $\eta$ as indicated. Red plot: Randomly chosen support sets with no restriction on the balancedness parameter.


• the signal vectors $s_i$ (generating the measurement matrix $A$) and also the noise vector $n$ are drawn randomly from the standard complex Gaussian distribution, i.e., we choose $\sigma^2 = 1$ as the variance of the noise,

• the support set $S$ of $x$ is drawn uniformly at random from all possible support sets having balancedness parameter $\eta$,

• the phases of the nonzero entries of $x$ are drawn uniformly at random from $[0, 2\pi)$,

• and the amplitudes $|x_\Theta|$, $\Theta \in S$, are set to the threshold level defined in (26).

We use the Matlab library TFOCS [2] for the minimization of the LASSO functional (19). Since TFOCS uses an iterative solver and, hence, in general only approximations to the desired minimizer can be computed, we consider a target scene $x$ to be successfully recovered if $\|x - x^{\#}\|_\infty < T$ for an appropriate threshold $T > 0$.
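The experimental setup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Matlab/TFOCS code: the support is drawn without any balancedness constraint (as for the red plot in Figure 2), the argument `amplitude` is a hypothetical stand-in for the threshold level from (26), and the measurement matrix and the LASSO solver are omitted. All names are illustrative.

```python
import cmath
import math
import random

# Problem sizes from the experiment (Doppler-free scenario).
N_T, N_R, N_t = 8, 8, 64
N = N_T * N_R * N_t   # problem dimension
m = N_R * N_t         # number of measurements

def standard_complex_gaussian(rng):
    """Draw x + iy with independent x, y ~ N(0, 1/2), i.e. total variance 1."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def random_scene(sparsity, amplitude, rng):
    """Draw a sparse target scene: support uniform among all subsets of the given
    size (no balancedness constraint), uniform phases, constant amplitude."""
    support = sorted(rng.sample(range(N), sparsity))
    x = [0j] * N
    for idx in support:
        phase = rng.uniform(0.0, 2.0 * math.pi)
        x[idx] = amplitude * cmath.exp(1j * phase)
    return x, support

def noise_vector(rng):
    """Noise vector with m standard complex Gaussian entries (sigma^2 = 1)."""
    return [standard_complex_gaussian(rng) for _ in range(m)]

def recovered(x, x_hat, tol):
    """Success criterion: sup-norm distance between x and the estimate below tol."""
    return max(abs(a - b) for a, b in zip(x, x_hat)) < tol
```

A full run would additionally build the measurement matrix from the signal vectors, form $y = Ax + n$, and pass $y$ to a LASSO solver before applying `recovered`.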

In Figure 2 we depict the results of our simulations. It becomes evident that small values of the parameter $\eta$ have a positive impact on the probability of perfect reconstruction of the support set $S$. Moreover, as predicted by formula (25), the dependence of the maximal number of reconstructable targets $|S|$ (for a fixed number of measurements $N_R N_t$) on the parameter $\eta$ is linear.
Appendix

A Orthogonality of the matrices $X_\Theta$

Due to the definition in (31), for $(i,j) \in [N_R] \times [N_T]$, the $(i,j)$th $N_t \times N_t$ block of $X_\Theta$ is given by
Lemma 26. The set of matrices $\{(N_T N_R N_t)^{-1/2} X_\Theta\}_\Theta$ forms an orthonormal basis.
Proof. We calculate the inner product $\langle X_{\Theta'}, X_\Theta\rangle = \operatorname{Tr}(X_{\Theta'}^* X_\Theta)$. To this end we calculate the $(j',j)$th block of the product $X_{\Theta'}^* X_\Theta$ (recall that $X_{\Theta'}^* X_\Theta$ is an $N_T \times N_T$ block matrix consisting of $N_t \times N_t$ blocks),


Nr 


Nr 




/c=l 

Nr 


/c=l 


_ gi27r'dB;S'A,3(fc-l)gi27r'dT/3'A3(j'-l)gj27r'dH/3A3(fc-l)gi27r'dT/3A;3 0'-l) 


k=l 


— g—i27r-dT/3'A3(j'-l)gi2ir-dT/3A3(i-l)gi27r'-iTgi^r' ^i2-K-dR(P-l3')Afi(k-l) ^^ 

(74) 


Affl 


k=l 


For the last equality we used that 

which follows directly from the definitions of the operators $M_f$ and $T_\tau$ (see (6)). Due to (74), the Frobenius inner product between two matrices is given as


Nt 

(AreqAfe) = £Tr([A’^,Afe]b’^l) 
f=i 

N'T 


/ Nt \ ^ Nr 

— I ^ g* 27 r-dT(/ 3 -/ 3 ')A; 3 (j-l) | g» 27 r- ^ r' ^ Tl{M f - fiT t-t') ■ 

^ j=l ^ fe=l 


Since $M_{f-f'}$ is a diagonal matrix, the trace of the product $M_{f-f'}T_{\tau-\tau'}$ can only be nonzero if at least one of the diagonal entries of the matrix $T_{\tau-\tau'}$ is nonzero, i.e., if $\tau = \tau'$ so that $T_{\tau-\tau'} = \mathrm{Id}$. This means that


$$\operatorname{Tr}(M_{f-f'}T_{\tau-\tau'}) = \operatorname{Tr}(M_{f-f'}) = \begin{cases} N_t & \text{if } f = f', \\ 0 & \text{otherwise.} \end{cases}$$


Recalling the formula for $\langle X_{\Theta'}, X_\Theta\rangle$ from above implies that for this inner product to be nonzero it necessarily has to hold that $\Theta' = \Theta$. Indeed, this follows from the appearance of the factor
which (recalling that dr = 1/2 and = 2/NTNji, see (1), (4)) is only nonzero (and equal to Nt) if 
fi' = p. Finally, we can conclude 


$$\langle X_{\Theta'}, X_\Theta\rangle = \delta_{\Theta',\Theta}\,N_T N_R N_t.$$


The normalization yields the result. □ 

B Proof of Lemma 11 

For small $u$ the first term on the right-hand side of (37) can be obtained by a volumetric argument. To this end, let for $S \subset \{1,\dots,N\}$, $B_S \subset \mathbb{C}^N$ denote the set of all vectors $x$ with $\|x\|_2 \le 1$ and support in $S$. Introducing $|\!|\!|x|\!|\!| := \|V_x\|_{2\to 2}$ we find using (35) and a volumetric estimate, see e.g. [14, Proposition C.3],

A7(Rs,|||.||U)<AA(R5,V^||-||2,w) < + <( 3 ^^) , 

where for the last inequality we used the assumption u < y^T/Nt. Since A is the union of all sets 3$ 
with $S \subset [N]$, and there are $\binom{N}{s} \le (eN/s)^s$ possible choices for $S$, it holds

A7(A|| • 112^2,^^) < (eiV/s)«(3\^7]^/u)'^ 
which implies the first bound in (37), namely


logAr(A|| • 112 ^ 2 , w) < 2 s |^log(eA^/s) + log < slog 

For the second bound from the assertion we exploit the fact that 

{x G : X s-sparse, ||a ;||2 < 1} C V^conv {ee, lee, —eg, —zee} =: Ds 

eee 

and, hence, the set $\mathcal{A}$ from (33) is contained in the set $\{V_x : x \in D_s\}$. The following is a version of Maurey's lemma. For a proof see, e.g., [20].

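Lemma 27. There exists an absolute constant $c$ for which the following holds. Let $X$ be a normed space, consider a finite set $\mathcal{U} \subset X$ of cardinality $N$, and assume that for every $L \in \mathbb{N}$ and $(u_1,\dots,u_L) \in \mathcal{U}^L$, $\mathbb{E}_\epsilon\|\sum_{j=1}^{L} \epsilon_j u_j\|_X \le A\sqrt{L}$, where $(\epsilon_1,\dots,\epsilon_L)$ denotes a Rademacher vector. Then for every $u > 0$,

$$\log \mathcal{N}(\operatorname{conv}(\mathcal{U}), \|\cdot\|_X, u) \le c\,(A/u)^2 \log(N).$$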

In order to apply Lemma 27, we need to estimate the quantity $\mathbb{E}_\epsilon\|\sum_{k=1}^{L} \epsilon_k V_{u_k}\|_{2\to 2}$, where $(u_1,\dots,u_L)$ is a sequence of extreme points in $D_s$ and $\epsilon = (\epsilon_1,\dots,\epsilon_L)$ is a Rademacher vector. The noncommutative Khintchine inequality [3, 25] (originally due to Lust-Piquard [23, 24]) or more modern estimates based on moment generating function bounds [30], see also [14, Problem 8.6(d)], yield

L ^ L L -x 1/2 

l|2-»2 ^ \/log(fVmax) max < || 14^. || 2-i.2, || || 2-»2 , (75) 

k^l ^ fc— 1 k^l ^ 

— — * ^ ^ 

where fVmax stands for the maximum of the dimensions of the matrices and and can 

be estimated by max{iVfl;A^(, < N. Using the estimate (35) for ||V}ij,|| 2 ->. 2 , 

\\VuXj2^2 = ||K,KJ| 2^2 = IIKJIL 2 < 

An application of the triangle inequality yields, using the Khintchine inequality (75), 

E.||^efcKJ| 2^2 < 

k—1 

Finally, we can apply Lemma 27 yielding 

logA/'(A, II • I| 2 ^ 2 ,m) < ^^log^( 77 ). 

u^rst 

This establishes the second bound in (37). □ 


C Basic calculations for Proposition 13 

The proof of Proposition 13 is based on the fact that 


Nt Nt 


"Hi; 


(76) 


i,j—l a,6=1 


where we write $s_{(i,a)}$ for the $a$-th entry of the signal vector $s_i$ and where for $\Theta, \Theta' \in S_{[\beta]}$ the corresponding entry of a given matrix $Y^{(i,a),(j,b)}$ is given by




( 77 ) 
To see this we recall that, according to (22), the inner products $\langle A_\Theta, A_{\Theta'}\rangle$ satisfy


= (^ 01 ^ 0 ') = {NTNnNt) ^(Ae,Ae/) 


Nt 


= (NTNt)-^ e 
j.i=i 


i2Tr-dTAi3[P'{j-l)-l3{i-l)] 


{M fTrSi,M fiTr/Sj) 


where we used that both $\Theta, \Theta' \in S_{[\beta]}$. Recalling the definitions of the operators $M_f$, $T_\tau$ (see (6)) one obtains for the latter inner product,


Nt 


Nt 


{MfTrSj, Mf>Tr’Sj) = [MfTT-Si]k[Mf'Tr'Sj]k = [si]fe_T-[sj]fc_,- 


fc=i 

N, 


^ g»27r-^£j^(a+T-l) 


~ '^a-b,T'- 


k=l 




;,b=l 


By combining the identities from above one finds 

Nt Nt 


i,j — la,b—l 

= [’5"(’'“)’«'')]e,e',see (77) 

which shows (76). 

The matrices $Y^{(i,a),(j,b)}$ allow for a simple formula for their adjoints, namely

$$\big[Y^{(i,a),(j,b)}\big]^* = Y^{(j,b),(i,a)}. \tag{78}$$


The product $F^*F$

The matrix $F$ consists of the blocks $Y^{(i,a),(j,b)}$ given by (77). Therefore, $F$ is self-adjoint so that $F^*F = F^2$. Like $F$, the product $F^2$ also consists of blocks, and the block at the $(i,a)$-th (block) row and the $(j,b)$-th (block) column is given by

$$[F^2]^{(i,a),(j,b)} = \sum_{r=1}^{N_T}\sum_{q=1}^{N_t} Y^{(i,a),(r,q)}\,Y^{(r,q),(j,b)}.$$

Recalling (77), the appearing summands $Y^{(i,a),(r,q)}Y^{(r,q),(j,b)}$ are given entrywise by

\'y(i,a),(r,q)'y'{r,q),(j,b)i _ \ ' y (»,“),(>'><?) y O'.f') 

1 0.0 0 . 0 ' 

0 GS[f)] 

= (N'rNt)~'^ 'S~^ _g*2-n--4^(''‘+“-l)g*2-n--'T^('f-l-<?-l) i2ir.dTA3[/3'(j-l)-/3(i-l)] 

\ -I- tj / ^ '^a—q,f—T q—b,T' — f^ 


(79) 


0 eS| 


[/3] 


= iNTN,)-X^llr+a\SlpY~>“^'^ 

Combining this with (79) yields 

Nt Nt 


y+a-<?|p*27r.'t^(a+r-l)gi27r.dTA3[/3'(j-l)-/3(i-l)] 


(80) 




r—1 q—1 
Properties of the matrices $Y^{(i,a),(j,b)}$

The proof of Proposition 13 uses the identities 


Nt Nt 


Nt Nt 


E E 


i,j — l a,b—l 


i,j — l a,b—l 


l%]l 

Nf, 


Id. 


(81) 


Due to (78), and by plugging in the identity we used in the second step of (80), the second sum is given 
entrywise by 


Nt Nt 


Nt Nt 


= E E 




r+a—& I 


i,j—l a,b—l 


{NrNt)^ 


g*2ii--£)^(T+a-l)gi27r-dTA;3(^'-/3)(i-l) 


= Sr'T 


{NTNtf'' 


Nt 

E 


i2-k- ^^ ^ (q~1) 


i,j—l a,b—l 
Nt 

y^ gZ27i-dTA3(/3'-/3)(z-l) _ 


l%]l 

Nt ’ 


a—1 i—1 

which establishes the second equality in (81). The first equality follows due to symmetry. 


D Basics from probability theory 

A complex-valued random variable $\xi$ is standard complex Gaussian iff it has (complex) density $\pi^{-1}e^{-|z|^2}$, or, equivalently, $\xi$ can be written as $\xi = x + iy$, where $x, y \sim N(0, 1/2)$ are independent Gaussian random variables. More generally, a mean-zero complex Gaussian random variable with variance $\sigma^2$ is of the form $\sigma\xi$, where $\xi$ is a standard complex Gaussian.

Lemma 28. For a standard complex Gaussian random variable $\xi$ it holds

$$\mathbb{P}(|\xi| > t) \le e^{-t^2}.$$
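The tail bound for a standard complex Gaussian can be checked by simulation. Since $|\xi|^2$ is exponentially distributed with mean 1, the tail is in fact exactly $e^{-t^2}$, so the empirical frequency should be close to this value. The following Monte Carlo sketch uses only the Python standard library; the function names and sample sizes are illustrative choices.

```python
import math
import random

def sample_standard_complex_gaussian(rng):
    """Draw x + iy with independent x, y ~ N(0, 1/2)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def empirical_tail(t, n_samples, rng):
    """Monte Carlo estimate of P(|xi| > t) for a standard complex Gaussian xi."""
    hits = sum(
        1 for _ in range(n_samples)
        if abs(sample_standard_complex_gaussian(rng)) > t
    )
    return hits / n_samples

def tail_value(t):
    """The value exp(-t^2) appearing on the right-hand side of the bound."""
    return math.exp(-t * t)

rng = random.Random(1)  # fixed seed for reproducibility
estimates = {t: empirical_tail(t, 20000, rng) for t in (0.5, 1.0, 1.5)}
```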

For a standard complex Gaussian random vector $\xi$ (having independent, standard complex Gaussian entries) and a (deterministic) complex vector $\alpha$ of the same dimension, the random variable $\vartheta := \langle \alpha, \xi\rangle$ is mean-zero complex Gaussian with variance $\|\alpha\|_2^2$. This implies the next statement.

Lemma 29. For a standard complex Gaussian random vector $\xi$ and a complex vector $\alpha$ of the same dimension it holds

$$\mathbb{P}(|\langle \alpha, \xi\rangle| > t) = \mathbb{P}(|\zeta| > t/\|\alpha\|_2) \le e^{-t^2/\|\alpha\|_2^2},$$

where $\zeta$ denotes a standard complex Gaussian random variable.

For a $2n$-dimensional standard Gaussian random vector $g$ we have, due to [14, (8.89)],

$$\mathbb{P}(\|g\|_2 > \sqrt{2n} + t) \le e^{-t^2/2}.$$

Since an $n$-dimensional standard complex Gaussian random vector $\xi$ can be considered as a (real-valued) $2n$-dimensional Gaussian random vector $2^{-1/2}g$ with independent entries from $N(0,1/2)$, where $g$ is standard Gaussian, we have the following lemma.

Lemma 30. For an $n$-dimensional standard complex Gaussian random vector $\xi$ it holds

$$\mathbb{P}(\|\xi\|_2 > \sqrt{n} + t) = \mathbb{P}(\|2^{-1/2}g\|_2 > \sqrt{n} + t) = \mathbb{P}(\|g\|_2 > \sqrt{2n} + \sqrt{2}\,t) \le e^{-t^2}.$$
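The concentration of the norm of a standard complex Gaussian vector, $\mathbb{P}(\|\xi\|_2 > \sqrt{n} + t) \le e^{-t^2}$, can also be observed empirically. The sketch below is a Monte Carlo illustration only; the dimension $n = 16$, the sample size, and the function names are arbitrary choices made here.

```python
import math
import random

def complex_gaussian_norm(n, rng):
    """Euclidean norm of an n-dimensional standard complex Gaussian vector."""
    s = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    total = 0.0
    for _ in range(n):
        re, im = rng.gauss(0.0, s), rng.gauss(0.0, s)
        total += re * re + im * im
    return math.sqrt(total)

def empirical_norm_tail(n, t, n_samples, rng):
    """Monte Carlo estimate of P(||xi||_2 > sqrt(n) + t)."""
    level = math.sqrt(n) + t
    hits = sum(1 for _ in range(n_samples) if complex_gaussian_norm(n, rng) > level)
    return hits / n_samples

rng = random.Random(7)  # fixed seed for reproducibility
# Pairs (empirical frequency, bound exp(-t^2)) for a few values of t.
norm_checks = [
    (empirical_norm_tail(16, t, 4000, rng), math.exp(-t * t))
    for t in (0.5, 1.0, 1.5)
]
```

In such runs the empirical frequencies typically lie well below the bound, reflecting that the estimate is not tight for moderate $n$.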

Finally, the following lemma states a Hoeffding-type inequality for Steinhaus sequences. Recall that a Steinhaus sequence is a sequence of independent random variables which are all distributed uniformly on the complex unit circle $\{z \in \mathbb{C} : |z| = 1\}$.

Lemma 31 ([14, Cor. 8.10]). Let $a \in \mathbb{C}^L$ and $\epsilon = (\epsilon_1,\dots,\epsilon_L)$ be a Steinhaus sequence. Then

$$\mathbb{P}(|\langle a, \epsilon\rangle| > u\|a\|_2) \le 2e^{-u^2/2}.$$
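The Hoeffding-type bound $\mathbb{P}(|\langle a, \epsilon\rangle| > u\|a\|_2) \le 2e^{-u^2/2}$ for Steinhaus sequences can be illustrated by simulation. The sketch below is an assumption-laden toy check: the test vector `a`, the sample size, and all function names are chosen here for illustration, and the inner product is taken as $\sum_j a_j \overline{\epsilon_j}$ (the convention does not affect the modulus distribution).

```python
import cmath
import math
import random

def steinhaus_sequence(length, rng):
    """Independent random variables uniformly distributed on the complex unit circle."""
    return [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(length)]

def empirical_steinhaus_tail(a, u, n_samples, rng):
    """Monte Carlo estimate of P(|<a, eps>| > u * ||a||_2)."""
    norm_a = math.sqrt(sum(abs(z) ** 2 for z in a))
    hits = 0
    for _ in range(n_samples):
        eps = steinhaus_sequence(len(a), rng)
        inner = sum(z * e.conjugate() for z, e in zip(a, eps))
        if abs(inner) > u * norm_a:
            hits += 1
    return hits / n_samples

def hoeffding_bound(u):
    """Right-hand side 2 * exp(-u^2 / 2) of the Hoeffding-type inequality."""
    return 2.0 * math.exp(-u * u / 2.0)

rng = random.Random(3)  # fixed seed for reproducibility
a = [complex(1, -1), 2, 0.5j, -1, 1.5]  # arbitrary deterministic test vector
steinhaus_checks = [
    (empirical_steinhaus_tail(a, u, 3000, rng), hoeffding_bound(u))
    for u in (1.5, 2.0, 2.5)
]
```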